Accounting for Unobserved Confounding to Reduce False Discoveries in Microbiome Research

Meghan Shilts Co-Author
Vanderbilt University Medical Center
 
Zhouwen Liu Co-Author
Vanderbilt University Medical Center
 
Tebeb Gebretsadik Co-Author
Vanderbilt University, School of Medicine
 
Christian Rosas-Salazar Co-Author
Vanderbilt University Medical Center
 
Suman Das Co-Author
Vanderbilt University Medical Center
 
Tina Hartert Co-Author
Vanderbilt University Medical Center
 
Chris McKennan Co-Author
The University of Chicago
 
Yu Shyr Co-Author
Vanderbilt University Medical Center
 
Siyuan Ma Co-Author
 
Chih-Ting Yang First Author
Vanderbilt University
 
Chih-Ting Yang Presenting Author
Vanderbilt University
 
Wednesday, Aug 7: 8:45 AM - 8:50 AM
2452 
Contributed Speed 
Oregon Convention Center 
Microbiome studies often use differential abundance analysis (DA) to identify microbial features associated with covariates of interest. Concerns about false discoveries from DA have grown in recent years, and the statistical literature usually attributes them to compositionality (microbial abundances are relative rather than absolute). In this work, we examine another potential cause: unobserved, microbiome-wide confounding (e.g., population structure, unmeasured technical effects). Such effects are often ignored in DA, yet they are known to inflate the false discovery rate (FDR) in molecular epidemiology, where research has shown that low-dimensional factor structure in the data can serve as a surrogate for confounding and can be adjusted for to control FDR. We present systematic, real-data-based evidence that unobserved confounding consistently inflates FDR in microbiome DA, and we show that existing factor-based correction methods, with simple modifications, can effectively address it. We implement these methods in open-source software that can be conveniently integrated with existing DA workflows. Our work is one of the first efforts to evaluate and correct for unobserved confounding to control FDR in microbiome DA.
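
To make the factor-based idea in the abstract concrete, here is a minimal Python sketch on simulated data. It is not the authors' software or their exact method. It only illustrates the general recipe of estimating a low-dimensional factor from regression residuals and removing the confounded component of the per-feature effects before applying Benjamini-Hochberg FDR control. Everything specific is an assumption made for illustration: the function names (`simulate`, `discoveries`, `bh_reject`, `two_sided_p`), the Gaussian toy data standing in for transformed abundances, the single hidden factor whose number is treated as known, the sparsity of true effects, and the normal-approximation p-values.

```python
import math
import numpy as np


def bh_reject(pvals, q=0.05):
    """Benjamini-Hochberg step-up: mask of hypotheses rejected at FDR level q."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    passed = p[order] <= q * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if passed.any():
        k = np.nonzero(passed)[0].max()  # largest rank meeting the BH bound
        reject[order[: k + 1]] = True
    return reject


def two_sided_p(z):
    """Two-sided p-values from a normal approximation (assumes moderately large n)."""
    return np.array([math.erfc(abs(v) / math.sqrt(2.0)) for v in z])


def simulate(n=200, p=300, n_signal=10, effect=1.5, seed=0):
    """Toy data (hypothetical): one hidden factor u, correlated with x, loads on every feature."""
    rng = np.random.default_rng(seed)
    x = rng.integers(0, 2, size=n).astype(float)   # covariate of interest
    u = x + rng.normal(size=n)                     # unobserved confounder
    loadings = rng.normal(size=p)
    b = np.zeros(p)
    b[:n_signal] = effect                          # only these features truly depend on x
    Y = np.outer(x, b) + np.outer(u, loadings) + rng.normal(size=(n, p))
    return x, Y, b != 0


def discoveries(x, Y, q=0.05):
    """Return BH rejection masks for (naive, factor-adjusted) per-feature tests."""
    n, _ = Y.shape
    xc = x - x.mean()
    sxx = xc @ xc
    Yc = Y - Y.mean(axis=0)
    beta = xc @ Yc / sxx                           # naive slopes, one per feature
    R = Yc - np.outer(xc, beta)                    # residuals after removing x
    se_naive = np.sqrt((R ** 2).sum(axis=0) / (n - 2) / sxx)
    naive = bh_reject(two_sided_p(beta / se_naive), q)

    # Leading right singular vector of the residuals estimates the loading direction.
    v = np.linalg.svd(R, full_matrices=False)[2][0]
    f = R @ v                                      # factor scores (orthogonal to x)
    alpha = beta @ v                               # OLS of naive slopes on v (v has unit norm)
    beta_adj = beta - alpha * v                    # remove the confounded component
    R_adj = R - np.outer(f, v)
    sigma2 = (R_adj ** 2).sum(axis=0) / (n - 3)
    # Standard error as if the reconstructed confounder were an observed covariate.
    se_adj = np.sqrt(sigma2 * (1.0 / sxx + alpha ** 2 / (f @ f)))
    adjusted = bh_reject(two_sided_p(beta_adj / se_adj), q)
    return naive, adjusted


x, Y, is_signal = simulate()
naive, adjusted = discoveries(x, Y)
for name, hits in (("naive", naive), ("factor-adjusted", adjusted)):
    print(name, "true:", int((hits & is_signal).sum()),
          "false:", int((hits & ~is_signal).sum()))
```

In this toy setting the naive analysis flags many null features because the hidden factor is correlated with the covariate, while the adjusted analysis should flag far fewer. Simply adding residual principal components as covariates would not suffice here, since the residual factor is orthogonal to the covariate by construction; the sketch instead regresses the naive slopes on the estimated loadings, which relies on true effects being sparse.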

Keywords

False discovery rate

Unobserved confounding

Microbiome

Differential abundance

Latent factor models 

Main Sponsor

Section on Statistics in Genomics and Genetics