Monday, Aug 4: 8:30 AM - 10:20 AM
0144
Invited Paper Session
Music City Center
Room: CC-Davidson Ballroom A3
Applied: No
Main Sponsor: IMS
Co Sponsors: Caucus for Women in Statistics; General Methodology
Presentations
The e-BH procedure is an e-value-based multiple testing procedure that provably controls the false discovery rate (FDR) under any dependence structure between the e-values. Despite this appealing theoretical FDR control guarantee, the e-BH procedure often suffers from low power in practice. In this paper, we propose a general framework that boosts the power of e-BH without sacrificing its FDR control under arbitrary dependence. This is achieved by the technique of conditional calibration, where we take as input the e-values and calibrate them to be a set of "boosted e-values" that are guaranteed to be no less -- and are often more -- powerful than the original ones. Our general framework is explicitly instantiated in three classes of multiple testing problems: (1) testing under parametric models, (2) conditional independence testing under the model-X setting, and (3) model-free conformalized selection. Extensive numerical experiments show that our proposed method significantly improves the power of e-BH while continuing to control the FDR. We also demonstrate the effectiveness of our method through an application to an observational study dataset for identifying individuals whose counterfactuals satisfy certain properties.
Keywords
Multiple testing
E-value
Conditional calibration
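The base e-BH procedure referenced in the abstract above sorts the e-values in decreasing order and rejects the hypotheses with the k* largest e-values, where k* is the largest k such that k times the k-th largest e-value is at least m/alpha. A minimal Python sketch of that base procedure follows; the proposed boosting step via conditional calibration is not shown, and the e-values and level alpha are hypothetical illustrative inputs.

import numpy as np

def e_bh(e_values, alpha=0.1):
    """Return indices of hypotheses rejected by the base e-BH procedure."""
    e = np.asarray(e_values, dtype=float)
    m = len(e)
    order = np.argsort(-e)                 # indices sorted by decreasing e-value
    sorted_e = e[order]
    ks = np.arange(1, m + 1)
    passing = np.nonzero(ks * sorted_e >= m / alpha)[0]   # k * e_(k) >= m / alpha
    if passing.size == 0:
        return np.array([], dtype=int)
    k_star = passing.max() + 1             # largest k meeting the threshold
    return order[:k_star]

# Example with hypothetical e-values; larger values indicate stronger evidence.
rejections = e_bh([20.0, 1.2, 35.0, 0.4, 15.0, 0.9], alpha=0.2)
print(rejections)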
We introduce a generic estimator for the false discovery rate of any model selection procedure, in common statistical modeling settings including the Gaussian linear model, Gaussian graphical model, and model-X setting. We prove that our method has a conservative (non-negative) bias in finite samples under standard statistical assumptions, and provide a bootstrap method for assessing its standard error. For methods like the Lasso, forward-stepwise regression, and the graphical Lasso, our estimator serves as a valuable companion to cross-validation, illuminating the tradeoff between prediction error and variable selection accuracy as a function of the model complexity parameter. This is joint work with Yixiang Luo and Lihua Lei.
Keywords
FDR estimate
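As a point of reference for the tradeoff mentioned in the abstract above, the sketch below simulates data, traces the Lasso path, and computes the true false discovery proportion of each selected set alongside a cross-validated fit. The talk's estimator is not reproduced here; it targets this proportion without knowledge of which signals are real, and the data, dimensions, and signal strengths below are hypothetical.

import numpy as np
from sklearn.linear_model import lasso_path, LassoCV

rng = np.random.default_rng(0)
n, p, s = 200, 50, 5
X = rng.standard_normal((n, p))
beta = np.zeros(p); beta[:s] = 2.0               # first s coefficients are true signals
y = X @ beta + rng.standard_normal(n)

alphas, coefs, _ = lasso_path(X, y)              # coefs has shape (p, n_alphas)
selected = np.abs(coefs) > 1e-10
n_sel = selected.sum(axis=0)
false_sel = selected[s:, :].sum(axis=0)          # selections among the true nulls
fdp = np.where(n_sel > 0, false_sel / np.maximum(n_sel, 1), 0.0)

cv = LassoCV(alphas=alphas, cv=5).fit(X, y)      # prediction-error side of the tradeoff
for a, k, f in zip(alphas[::10], n_sel[::10], fdp[::10]):
    print(f"lambda={a:.3f}  selected={k:2d}  true FDP={f:.2f}")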
We introduce a general framework for augmenting structure-adaptive multiple testing with auxiliary information extracted from large language models (LLMs). Modern multiple testing procedures such as CAMT, SABHA, and OrderShapeEM gain substantial power by leveraging covariates or prior rankings, yet such structural information is often unavailable or domain-specific. We propose to apply LLMs to generate soft priors, such as feature relevance scores or hypothesis orderings, through prompt-based querying of natural language descriptions. These LLM-derived outputs can be directly integrated into multiple testing algorithms without altering their core logic. We demonstrate the effectiveness of this approach through simulations and a real-world application to identifying proteomic markers of cardiovascular disease risk, showing that knowledge from LLMs consistently enhances power while maintaining valid FDR control. Our results highlight LLMs as flexible and powerful tools for automating the construction of auxiliary information in high-dimensional statistical inference.
Keywords
Benjamini-Hochberg procedure, Cross-fitting, False discovery rate, Leave-one-out analysis, Multiple testing
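One simple way LLM-derived relevance scores can enter a multiple testing procedure is as weights in a weighted Benjamini-Hochberg procedure. The sketch below is a hedged illustration in that spirit, not the specific CAMT, SABHA, or OrderShapeEM integrations studied in the talk; the p-values and relevance scores are hypothetical stand-ins for prompt-based LLM output.

import numpy as np

def weighted_bh(p_values, weights, alpha=0.1):
    """Weighted BH: rescale weights to average 1, then apply BH to p_i / w_i."""
    p = np.asarray(p_values, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w * len(w) / w.sum()               # normalize so the weights average to 1
    q = p / w
    m = len(q)
    order = np.argsort(q)
    thresh = alpha * np.arange(1, m + 1) / m
    passing = np.nonzero(q[order] <= thresh)[0]
    if passing.size == 0:
        return np.array([], dtype=int)
    return order[:passing.max() + 1]

# Hypothetical inputs: p-values from a proteomics screen and LLM relevance scores
# obtained by querying natural-language descriptions of each protein.
p_values = [0.001, 0.04, 0.20, 0.003, 0.60]
llm_scores = [0.9, 0.8, 0.1, 0.7, 0.2]
print(weighted_bh(p_values, llm_scores, alpha=0.1))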