Active multiple testing with proxy p-values and e-values
Monday, Aug 4: 8:35 AM - 8:40 AM
2704
Contributed Speed
Music City Center
Researchers often lack the resources to test every hypothesis of interest directly or to compute
test statistics comprehensively, but they often possess auxiliary data from which an estimate of
the experimental outcome can be computed. We introduce a novel approach for selecting the
hypotheses for which to query the true statistic, leveraging these estimates to compute proxy statistics. Our framework allows a scientist to
propose a proxy statistic, and then query the true statistic with some probability based on
the value of the proxy. We make no assumptions about how the proxy is derived, and it can be
arbitrarily dependent on the true statistic. If the true statistic is not queried, the proxy is used
in its place. We characterize "active" methods that produce valid p-values and e-values in this
setting and use this framework to create procedures with false
discovery rate (FDR) control. Through simulations and real data analysis of causal effects in
scCRISPR screen experiments, we empirically demonstrate that our proxy framework has both
high power and low resource usage when our proxies are accurate estimates of the respective true statistics.
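The query-or-fall-back mechanism described above can be illustrated with a minimal sketch. The abstract does not state the query probability or any rescaling, so the specific rule here is an assumption chosen for illustration: skip the query with probability gamma * proxy_p, and divide a queried p-value by (1 - gamma). The names active_p_value, query_true_p, and gamma are hypothetical. Under this rule the output is a valid p-value for any dependence between proxy and true p-value, as long as the coin flip depends only on the proxy.

```python
import random


def active_p_value(proxy_p, query_true_p, gamma=0.5, rng=random):
    """Illustrative active p-value (assumed rule, not taken from the abstract).

    proxy_p      : proxy p-value in [0, 1], derived in any way.
    query_true_p : zero-argument callable that runs the costly test and
                   returns the true p-value; called only when we query.
    gamma        : tuning constant in (0, 1); larger values skip more queries.
    rng          : object with a .random() method returning a float in [0, 1).

    Returns (p_value, queried).
    """
    if not 0.0 < gamma < 1.0:
        raise ValueError("gamma must lie strictly between 0 and 1")
    if not 0.0 <= proxy_p <= 1.0:
        raise ValueError("proxy_p must lie in [0, 1]")

    # Skip the query with probability gamma * proxy_p, so unpromising
    # (large) proxies are skipped more often than promising (small) ones.
    if rng.random() < gamma * proxy_p:
        return proxy_p, False

    # Otherwise pay for the true statistic and inflate it by 1 / (1 - gamma).
    # Validity sketch for a true null and any threshold s in [0, 1]:
    #   P(skip and proxy <= s)          <= gamma * s
    #   P(query and true/(1-gamma) <= s) <= (1 - gamma) * s
    # so the two cases together have probability at most s.
    true_p = query_true_p()
    return min(1.0, true_p / (1.0 - gamma)), True
```

With gamma = 0.5, a hypothesis whose proxy p-value is 0.8 is skipped 40% of the time, while one with proxy 0.01 is almost always queried. The resulting values could then be passed to an FDR procedure appropriate for their dependence structure.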
multiple testing
e-values
false discovery rate (FDR)
active sampling
Main Sponsor
Section on Statistical Learning and Data Science