Wald-Based Logistic Regression for Differential Abundance Analysis in Large-Scale Microbiome Data

Abstract Number:

2149 

Submission Type:

Contributed Abstract 

Contributed Abstract Type:

Paper 

Participants:

Mengyu He (1), Yijuan Hu (2), Glen Satten (3)

Institutions:

(1) Emory University, Rollins School of Public Health; (2) Emory University, Department of Biostatistics & Bioinformatics; (3) Emory University School of Medicine

Co-Author(s):

Yijuan Hu  
Emory University, Department of Biostatistics & Bioinformatics
Glen Satten  
Emory University School of Medicine

First Author:

Mengyu He  
Emory University, Rollins School of Public Health

Presenting Author:

Mengyu He  
Emory University, Rollins School of Public Health

Abstract Text:

Recent advances in sequencing technologies have vastly increased the availability and depth of microbiome data, posing significant computational and statistical challenges. While LOCOM provides strong false discovery rate (FDR) control and high sensitivity for differential abundance testing, its permutation-based framework becomes computationally expensive at large scales. Moreover, large datasets frequently exhibit batch effects and substantial variation in library size, both of which can confound disease associations. Because LOCOM's likelihood-based estimation inherently upweights high-depth samples, these disparities can further bias results.

We propose a computationally efficient alternative that replaces permutation-based inference with a Wald test and introduces an M-estimator-based framework for more balanced weighting. Besides supporting equal weighting of samples to mitigate these biases, our approach accommodates relative abundance data, whereas LOCOM accepts only count data; this offers greater flexibility for diverse microbiome analyses.
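
As a rough illustration of the idea, and not the authors' implementation, the sketch below fits a logistic model to the proportion of one taxon relative to a reference taxon by solving a weighted estimating equation, then forms a Wald test from a sandwich variance estimate. The function name wald_logistic, the single-taxon setup, the use of a sandwich variance, and the simulated data are all assumptions made for illustration; the abstract does not specify these details.

```python
import math
import numpy as np


def wald_logistic(y, X, w=None, n_iter=50, tol=1e-10):
    """Hypothetical helper: solve sum_i w_i * X_i * (y_i - mu_i) = 0 by
    Newton's method and return (beta, sandwich SEs, two-sided Wald p-values).

    y : proportions in (0, 1), e.g. taxon / (taxon + reference)
    X : design matrix whose first column is the intercept
    w : per-sample weights; None means equal weighting
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    w = np.ones(len(y)) if w is None else np.asarray(w, dtype=float)

    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-X @ beta))
        score = X.T @ (w * (y - mu))
        # "Bread": minus the derivative of the estimating function
        A = (X * (w * mu * (1.0 - mu))[:, None]).T @ X
        step = np.linalg.solve(A, score)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break

    mu = 1.0 / (1.0 + np.exp(-X @ beta))
    A = (X * (w * mu * (1.0 - mu))[:, None]).T @ X
    # "Meat": empirical second moment of the per-sample score contributions
    B = (X * ((w * (y - mu)) ** 2)[:, None]).T @ X
    A_inv = np.linalg.inv(A)
    V = A_inv @ B @ A_inv  # sandwich variance estimate

    se = np.sqrt(np.diag(V))
    z = beta / se
    # Two-sided normal p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    pvals = np.array([math.erfc(abs(v) / math.sqrt(2.0)) for v in z])
    return beta, se, pvals


# Simulated toy data (assumption): one taxon vs. a reference taxon,
# a binary trait, and widely varying sequencing depth.
rng = np.random.default_rng(0)
n = 200
group = rng.integers(0, 2, size=n)
depth = rng.integers(500, 50000, size=n)
X = np.column_stack([np.ones(n), group])
true_p = 1.0 / (1.0 + np.exp(-(-1.0 + 0.5 * group)))
counts = rng.binomial(depth, true_p)
prop = counts / depth

# Equal weighting needs only the proportions, not the counts.
beta_eq, se_eq, p_eq = wald_logistic(prop, X)
# Depth weighting mimics how a count likelihood upweights deep samples.
beta_dw, se_dw, p_dw = wald_logistic(prop, X, w=depth)
```

With equal weights the fit uses only the proportions, which is why relative abundance data suffice in this sketch; passing w=depth reproduces the depth-proportional weighting that a binomial count likelihood would imply. No permutation step is involved, since the p-values come directly from the Wald statistic.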

Through realistic simulations, we show that our method is computationally efficient and offers robust FDR control, making it well-suited for large-scale microbiome analysis.

Keywords:

large-scale microbiome data | differential abundance testing | M-estimator | FDR control | relative abundance data

Sponsors:

Section on Statistics in Genomics and Genetics

Tracks:

Miscellaneous
