Using sufficiency and sparsity for more powerful
controlled variable selection in the linear model
Wednesday, Aug 6: 3:05 PM - 3:20 PM
1062
Contributed Papers
Music City Center
We show that for the problem of controlled variable selection in the Gaussian linear
model, informative and valid weights (for weighted multiple testing) can be derived
entirely from sufficient statistics and a belief in sparsity, using only the data itself and
no external quantitative side information. This idea results in new procedures with
strict guarantees on the (unweighted) familywise error rate or false discovery rate and
that are more powerful than existing methods when the model is sparse. A naive
implementation of our idea is computationally intensive, so we propose computational
improvements that maintain strict validity while having little impact on the power.
We show that the same idea extends asymptotically to any setting with a Gaussian
limit and a consistently estimable covariance matrix, such as any M-estimation problem.
We demonstrate the performance of our methods in simulations and an application to
HIV drug resistance.
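
For readers unfamiliar with weighted multiple testing, the following is a minimal sketch of the two standard weighted procedures that the abstract's terminology refers to: weighted Bonferroni (familywise error rate) and weighted Benjamini-Hochberg (false discovery rate). It is not the procedure proposed in this talk. The weights here are assumed to be supplied externally and fixed in advance, whereas the talk's contribution is deriving valid weights from the data itself. The function names `weighted_bonferroni` and `weighted_bh` and the example numbers are hypothetical, chosen only for illustration.

```python
import numpy as np


def _normalize(pvals, weights):
    """Validate inputs and rescale weights so that they average to 1."""
    p = np.asarray(pvals, dtype=float)
    w = np.asarray(weights, dtype=float)
    if p.shape != w.shape or p.ndim != 1:
        raise ValueError("pvals and weights must be 1-D arrays of equal length")
    if np.any(w < 0) or w.sum() <= 0:
        raise ValueError("weights must be nonnegative with a positive sum")
    return p, w * p.size / w.sum()


def weighted_bonferroni(pvals, weights, alpha=0.05):
    """Reject H_j when p_j <= w_j * alpha / m (weights assumed fixed in advance)."""
    p, w = _normalize(pvals, weights)
    return p <= w * alpha / p.size


def weighted_bh(pvals, weights, alpha=0.05):
    """Benjamini-Hochberg step-up applied to the weighted p-values p_j / w_j."""
    p, w = _normalize(pvals, weights)
    m = p.size
    # A zero weight means the hypothesis can never be rejected.
    with np.errstate(divide="ignore", invalid="ignore"):
        q = np.where(w > 0, p / w, np.inf)
    order = np.argsort(q)
    below = q[order] <= alpha * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.nonzero(below)[0].max()  # largest rank passing the step-up test
        reject[order[: k + 1]] = True
    return reject


# Hypothetical example: the first hypothesis is believed more likely non-null.
example_p = [0.02, 0.02, 0.04, 0.30]
example_w = [2.0, 1.0, 0.5, 0.5]
fwer_rejections = weighted_bonferroni(example_p, example_w)
fdr_rejections = weighted_bh(example_p, example_w)
```

With equal weights both functions reduce to the ordinary Bonferroni and Benjamini-Hochberg procedures. Upweighting a hypothesis loosens its threshold at the cost of tightening the others, which is why informative weights can raise power when the model is sparse.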
Variable selection
Weighted multiple testing
Sparsity
Familywise error rate
False discovery rate
Main Sponsor
Section on Statistical Learning and Data Science