Simultaneous inference for generalized linear models with unmeasured confounders

Larry Wasserman Co-Author
Carnegie Mellon University
 
Kathryn Roeder Co-Author
Carnegie Mellon University
 
Jin-Hong Du First Author
Carnegie Mellon University
 
Jin-Hong Du Presenting Author
Carnegie Mellon University
 
Monday, Aug 5: 10:05 AM - 10:20 AM
2080 
Contributed Papers 
Oregon Convention Center 
Tens of thousands of simultaneous hypothesis tests are routinely performed in genomic studies to identify differentially expressed genes. However, due to unmeasured confounders, many standard statistical approaches may be substantially biased. This paper investigates the large-scale hypothesis testing problem for multivariate generalized linear models in the presence of confounding effects. Under arbitrary confounding mechanisms, we propose a unified statistical estimation and inference framework that harnesses orthogonal structures and integrates linear projections into three key stages. It begins by disentangling marginal and uncorrelated confounding effects to recover latent coefficients. Then, latent factors and primary effects are jointly estimated by lasso-type optimization. Finally, we incorporate bias-correction steps for hypothesis testing. Theoretically, we establish identification conditions, non-asymptotic error bounds, and effective Type-I error control as the sample and response sizes approach infinity. By comparing single-cell RNA-seq counts from two groups of samples, we demonstrate the suitability of adjusting for confounding effects when significant covariates are absent.

Keywords

Hidden variables

Surrogate variables analysis

Multivariate response regression

Hypothesis testing 

Main Sponsor

IMS