The Impact of Compositional Responses on the False Discovery Rate in Microbiome Data

Jung Ae Lee First Author
University of Massachusetts Chan Medical School
 
Jung Ae Lee Presenting Author
University of Massachusetts Chan Medical School
 
Wednesday, Aug 7: 10:10 AM - 10:15 AM
2062 
Contributed Speed 
Oregon Convention Center 
Microbiome data are typically reported as the relative abundances of hundreds of taxa. Such data, generally referred to as high-dimensional compositional data, are inherently prone to violating the independence assumption because of the sum-to-constant constraint. In particular, feature selection with false discovery rate (FDR) control needs to be examined, because multiple testing procedures assume independent p-values. While log-ratio transformations may satisfy some assumptions, interpreting the transformed variables can be challenging in practice. For practical and useful variable selection, inference should rely on the original scale with no transformation, which naturally leads us to investigate the impact of compositional responses. The literature documents FDR-based inference under dependency; however, such methods tend to be conservative, especially when all null hypotheses are assumed to be true. In this study, we aim to identify the weak dependency conditions under which the usual FDR procedure remains effective with compositional responses in microbiome data. We provide guidelines on when to modify the FDR procedure under dependency.
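
The contrast between a standard FDR procedure and a dependency-adjusted one can be illustrated with a minimal sketch. The abstract does not name specific procedures, so this sketch assumes the "usual" procedure is the Benjamini-Hochberg (BH) step-up rule and the dependency-adjusted one is the Benjamini-Yekutieli (BY) variant, which divides each threshold by the harmonic sum 1 + 1/2 + ... + 1/m. The function name `step_up_rejections` and the p-values are hypothetical and chosen only for illustration; they are not data or code from the study.

```python
import numpy as np


def step_up_rejections(pvals, alpha=0.05, dependency_correction=False):
    """Return a boolean mask of rejected hypotheses.

    dependency_correction=False: Benjamini-Hochberg step-up rule.
    dependency_correction=True:  Benjamini-Yekutieli rule, valid under
                                 arbitrary dependence but more conservative.
    (Assumed procedures; the abstract does not name them.)
    """
    p = np.asarray(pvals, dtype=float)
    m = p.size
    ranks = np.arange(1, m + 1)
    # BY shrinks every threshold by the harmonic sum c(m); BH uses c(m) = 1.
    c_m = np.sum(1.0 / ranks) if dependency_correction else 1.0
    thresholds = alpha * ranks / (m * c_m)

    order = np.argsort(p)
    below = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if below.any():
        # Step-up: reject every hypothesis up to the largest rank that passes.
        k = np.nonzero(below)[0].max()
        reject[order[: k + 1]] = True
    return reject


# Hypothetical p-values, one per taxon (illustration only).
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205]

bh = step_up_rejections(pvals, alpha=0.05)
by = step_up_rejections(pvals, alpha=0.05, dependency_correction=True)

print("BH rejections:", int(bh.sum()))
print("BY rejections:", int(by.sum()))
```

With these illustrative p-values, the BH rule rejects two hypotheses and the BY rule rejects one, which shows the kind of conservativeness the abstract attributes to dependency-adjusted inference.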

Keywords

high-dimensional compositional data


microbiome data

multiple testing

false discovery rate 

Main Sponsor

Biometrics Section