Wald-Based Weighted Logistic Regression for Differential Abundance Analysis in Microbiome Data

Yijuan Hu Co-Author
Emory University, Department of Biostatistics & Bioinformatics
 
Glen Satten Co-Author
Emory University School of Medicine
 
Mengyu He First Author
Emory University, Rollins School of Public Health
 
Mengyu He Presenting Author
Emory University, Rollins School of Public Health
 
Monday, Aug 4: 12:05 PM - 12:20 PM
2149 
Contributed Papers 
Music City Center 
Recent advances in sequencing technologies have vastly increased the availability and depth of microbiome data, posing significant computational and statistical challenges. While LOCOM provides strong FDR control and high sensitivity for differential abundance testing, its permutation-based framework becomes computationally expensive at large scales. Moreover, large datasets frequently exhibit batch effects and substantial variation in library size, which can confound disease associations. Because LOCOM's likelihood-based estimation inherently upweights high-depth samples, these disparities can further bias results. We introduce an M-estimator-based weighted logistic regression that allows more balanced weighting of samples, and we replace permutation-based inference with a computationally efficient Wald test. In addition to supporting equal weighting to mitigate these biases, our approach accommodates relative abundance data, whereas LOCOM accepts only count data. Through realistic simulations, we show that our method is computationally efficient and offers robust FDR control.
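The general idea of a weighted logistic M-estimator with a Wald test can be illustrated with a minimal sketch. This is not the authors' implementation and does not reproduce LOCOM; it makes several assumptions for illustration only: a single taxon compared against a single reference taxon, a response p_i equal to the taxon's proportion relative to the reference, the estimating equation sum_i w_i x_i (p_i - expit(x_i' beta)) = 0, and a standard sandwich variance estimate. The function name weighted_logit_wald and all simulated quantities are hypothetical. Setting w_i to the sample's count total mimics likelihood-style weighting that favors deep samples, while w_i = 1 gives equal weighting and also works when only relative abundances are available.

```python
import math
import numpy as np

def weighted_logit_wald(X, p, w, n_iter=50, tol=1e-10):
    """Hypothetical sketch: solve sum_i w_i x_i (p_i - expit(x_i' beta)) = 0
    by Newton's method, then return sandwich-based Wald statistics."""
    X = np.asarray(X, dtype=float)
    p = np.asarray(p, dtype=float)
    w = np.asarray(w, dtype=float)
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-X @ beta))
        score = X.T @ (w * (p - mu))                  # estimating function
        A = (X * (w * mu * (1.0 - mu))[:, None]).T @ X  # minus its derivative
        step = np.linalg.solve(A, score)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    # Sandwich variance A^{-1} B A^{-1}, evaluated at the final estimate
    mu = 1.0 / (1.0 + np.exp(-X @ beta))
    A = (X * (w * mu * (1.0 - mu))[:, None]).T @ X
    r = w * (p - mu)
    B = (X * (r ** 2)[:, None]).T @ X
    A_inv = np.linalg.inv(A)
    V = A_inv @ B @ A_inv
    se = np.sqrt(np.diag(V))
    z = beta / se
    # Two-sided normal p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    pvals = np.array([math.erfc(abs(v) / math.sqrt(2.0)) for v in z])
    return beta, se, z, pvals

# Simulated illustration (all values are assumptions, not data from the study)
rng = np.random.default_rng(0)
n_samples = 200
trait = rng.integers(0, 2, size=n_samples)         # binary trait, e.g. case/control
depth = rng.integers(50, 5000, size=n_samples)     # highly variable count totals
true_prob = 1.0 / (1.0 + np.exp(-(-0.5 + 0.4 * trait)))
counts = rng.binomial(depth, true_prob)            # taxon count out of taxon + reference
prop = counts / depth                              # relative abundance response
X = np.column_stack([np.ones(n_samples), trait])

fit_depth = weighted_logit_wald(X, prop, depth)               # depth weighting
fit_equal = weighted_logit_wald(X, prop, np.ones(n_samples))  # equal weighting
print("depth-weighted beta, p-values:", fit_depth[0], fit_depth[3])
print("equal-weighted beta, p-values:", fit_equal[0], fit_equal[3])
```

In this sketch the Wald test needs only one model fit per taxon, which is where the computational saving over permutation-based inference would come from. Multiple-testing adjustment across taxa for FDR control, as well as the choice of reference, is outside what the sketch covers.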

Keywords

large-scale microbiome data

differential abundance testing

M-estimator

FDR control

relative abundance data 

Main Sponsor

Section on Statistics in Genomics and Genetics