Bayesian GLMs for Analyzing Compositional
and Sub-compositional Microbiome Data via EM Algorithm
Nengjun Yi
Co-Author
University of Alabama at Birmingham
Li Zhang
First Author
Fox Chase Cancer Center
Li Zhang
Presenting Author
Fox Chase Cancer Center
Thursday, Aug 7: 8:35 AM - 8:50 AM
1229
Contributed Papers
Music City Center
Analyzing compositional microbiome data is crucial for understanding the roles microbes play in health and disease. Practice has shifted from traditional log-ratio transformations to methods that enforce a zero-sum constraint on the regression coefficients. However, penalized regression provides only point estimates, while Markov chain Monte Carlo (MCMC) methods, though accurate, are computationally intensive for high-dimensional data.
We propose Bayesian generalized linear models for analyzing compositional and sub-compositional microbiome data. The model uses a spike-and-slab double-exponential prior, which applies weak shrinkage to significant coefficients and strong shrinkage to irrelevant ones. The sum-to-zero constraint is handled via soft-centering, that is, by placing a prior distribution on the sum of the coefficients. A fast and stable algorithm integrates EM steps into the iteratively weighted least squares (IWLS) algorithm to improve computational efficiency.
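The soft-centering idea can be illustrated with a minimal sketch. This is not the authors' algorithm and does not use the BhGLM API: it assumes a Gaussian response with unit noise variance, swaps the spike-and-slab double-exponential prior for a plain normal prior, and replaces the EM-IWLS fit with a closed-form solve. The function name `soft_centered_ridge`, the parameters `tau2` and `sum_var`, and the simulated data are all hypothetical and chosen only for illustration.

```python
import numpy as np


def soft_centered_ridge(X, y, tau2=1.0, sum_var=1e-6):
    """MAP estimate under assumed priors (illustrative only):
    y ~ N(X b, I), b_j ~ N(0, tau2), and sum(b) ~ N(0, sum_var).
    A small sum_var pulls the coefficient sum toward zero (soft-centering).
    """
    p = X.shape[1]
    ones = np.ones((p, 1))
    # Prior precision: ridge term plus a rank-one penalty on the coefficient sum.
    precision = np.eye(p) / tau2 + (ones @ ones.T) / sum_var
    return np.linalg.solve(X.T @ X + precision, X.T @ y)


# Simulated compositional predictors: each row is positive and sums to one.
rng = np.random.default_rng(0)
n, p = 200, 6
raw = rng.gamma(shape=2.0, size=(n, p))
comp = raw / raw.sum(axis=1, keepdims=True)
Z = np.log(comp)  # log-transformed proportions used as covariates

# True coefficients sum to zero, as the constraint requires.
beta_true = np.array([1.0, -1.0, 0.5, -0.5, 0.0, 0.0])
y = Z @ beta_true + rng.normal(scale=0.1, size=n)

beta_hat = soft_centered_ridge(Z, y)
print("estimated coefficients:", beta_hat.round(3))
print("sum of coefficients:", beta_hat.sum())
```

In this sketch the constraint is never imposed exactly. The prior variance `sum_var` controls how tightly the coefficient sum is held near zero, so the fit stays an ordinary penalized least-squares problem.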
Extensive simulations show that our method outperforms existing approaches in accuracy and prediction. We applied it to a microbiome study, identifying microorganisms linked to inflammatory bowel disease (IBD). The method is available in the R package BhGLM (https://github.com/nyiuab/BhGLM).
Bayesian GLMs
Compositional data
EM algorithm
Microbiome
Sum-to-zero constraint
Spike-and-slab priors
Main Sponsor
ENAR