Bayesian GLMs for Analyzing Compositional and Sub-compositional Microbiome Data via EM Algorithm

Zhenying Ding Co-Author
University of Alabama at Birmingham

Jinhong Cui Co-Author
University of Alabama at Birmingham
 
Xiaoxiao Zhou Co-Author
Duke University
 
Nengjun Yi Co-Author
University of Alabama at Birmingham
 
Li Zhang First Author
Fox Chase Cancer Center
 
Li Zhang Presenting Author
Fox Chase Cancer Center
 
Thursday, Aug 7: 8:35 AM - 8:50 AM
1229 
Contributed Papers 
Music City Center 
The study of compositional microbiome data is crucial for understanding microbial roles in health and disease. Analysis has shifted from traditional log-ratio transformations to methods that enforce a sum-to-zero constraint on the regression coefficients. However, penalized regression provides only point estimates, while Markov chain Monte Carlo (MCMC) methods, though accurate, are computationally intensive for high-dimensional data.

We propose Bayesian generalized linear models (GLMs) for analyzing compositional and sub-compositional microbiome data. The model uses a spike-and-slab double-exponential prior, which applies weak shrinkage to relevant coefficients and strong shrinkage to irrelevant ones. The sum-to-zero constraint is handled via soft-centering, which places a prior distribution on the sum of the coefficients. A fast and stable algorithm embeds expectation-maximization (EM) steps in the iteratively weighted least squares (IWLS) algorithm to improve computational efficiency.
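
To make the idea of EM steps inside IWLS concrete, here is a minimal Python sketch for a binary outcome. It is an illustration under stated assumptions, not the BhGLM implementation: the function names (`de_density`, `em_iwls_logistic`), the scales `s0` and `s1`, the logistic link, the starting values, and the choice of a normal prior with standard deviation `sum_sd` on the sum of coefficients are all hypothetical. The double-exponential prior is handled through its usual normal scale-mixture representation, so each M-step reduces to a ridge-type weighted least squares solve.

```python
import numpy as np

def de_density(b, scale):
    """Double-exponential (Laplace) density with mean 0 and the given scale."""
    return np.exp(-np.abs(b) / scale) / (2.0 * scale)

def em_iwls_logistic(Z, y, s0=0.05, s1=1.0, sum_sd=1e-3, n_iter=200, tol=1e-6):
    """Hypothetical sketch: logistic regression on log-abundances Z (n x m)
    with a spike-and-slab double-exponential prior (spike scale s0, slab
    scale s1) and a soft sum-to-zero constraint on the taxon coefficients."""
    n, m = Z.shape
    X = np.column_stack([np.ones(n), Z])            # intercept + m taxa
    beta = np.zeros(m + 1)
    theta = 0.5                                     # prior inclusion probability
    p = np.full(m, theta)
    inv_tau2 = np.full(m, 1.0 / s1**2)              # assumed ridge-like start
    sum_row = np.concatenate([[0.0], np.ones(m)])   # selects the sum of taxon coefficients
    for _ in range(n_iter):
        # IWLS working response and weights for the logistic link
        eta = np.clip(X @ beta, -30.0, 30.0)
        mu = 1.0 / (1.0 + np.exp(-eta))
        w = np.clip(mu * (1.0 - mu), 1e-6, None)
        z = eta + (y - mu) / w
        # M-step: weighted least squares with the prior as a quadratic penalty
        A = X.T @ (w[:, None] * X)
        A += np.diag(np.concatenate([[0.0], inv_tau2]))   # intercept is not penalized
        A += np.outer(sum_row, sum_row) / sum_sd**2       # soft-centering (assumed normal prior)
        new_beta = np.linalg.solve(A, X.T @ (w * z))
        # E-step: inclusion probabilities and expected inverse variances
        b = new_beta[1:]
        slab = theta * de_density(b, s1)
        spike = (1.0 - theta) * de_density(b, s0)
        p = slab / (slab + spike)
        inv_scale = (1.0 - p) / s0 + p / s1
        inv_tau2 = inv_scale / np.maximum(np.abs(b), 1e-6)
        theta = float(np.clip(p.mean(), 0.01, 0.99))
        converged = np.max(np.abs(new_beta - beta)) < tol
        beta = new_beta
        if converged:
            break
    return beta, p
```

Calling `em_iwls_logistic(Z, y)` with `Z` the log-transformed relative abundances returns the coefficient vector (intercept first) and the posterior inclusion probabilities of the taxa. In this sketch, coefficients near zero receive a large expected inverse variance and are shrunk strongly, while large coefficients are penalized only mildly. A smaller `sum_sd` enforces the sum-to-zero constraint more tightly.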

Extensive simulations show that our method outperforms existing approaches in both accuracy and predictive performance. We applied it to a microbiome study and identified microorganisms linked to inflammatory bowel disease (IBD). The method is available in the R package BhGLM (https://github.com/nyiuab/BhGLM).

Keywords

Bayesian GLMs

Compositional data

EM algorithm

Microbiome

Sum-to-zero constraint

Spike-and-slab priors 

Main Sponsor

ENAR