Zinck enables accurate biomarker identification from microbiome data

Zhengzheng Tang Co-Author
University of Wisconsin-Madison
 
Soham Ghosh Co-Author
University of Wisconsin-Madison
 
Guanhua Chen Speaker
University of Wisconsin-Madison
 
Sunday, Aug 3: 4:05 PM - 4:25 PM
Topic-Contributed Paper Session 
Music City Center 
Accurate identification of microbial biomarkers is hindered by the unique characteristics of microbiome data, which often result in excessive false positives and reduced statistical power. To address these challenges, we introduce Zinck, a knockoff-based feature selection framework equipped with a high-fidelity knockoff generator. Zinck effectively captures key properties of microbiome data, including zero inflation, complex correlation structures, high variability, and strong batch effects. Through simulations, we demonstrate Zinck's superior statistical power and its robust control of false positives. In real data applications, Zinck successfully identifies biologically relevant microbial biomarkers for colorectal cancer and inflammatory bowel disease, significantly enhancing disease prediction accuracy.
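
For readers unfamiliar with knockoff filters, the selection step that such frameworks typically share can be sketched as follows. This is a minimal illustration of the standard knockoff+ threshold, not Zinck's knockoff generator or the authors' code. It assumes the feature statistics W (one per taxon, large and positive when the real feature looks more important than its knockoff copy) have already been computed, and the function names are hypothetical.

```python
import numpy as np

def knockoff_plus_threshold(W, q):
    """Return the knockoff+ threshold for target FDR level q.

    The threshold is the smallest t among the nonzero |W_j| such that
    (1 + #{j: W_j <= -t}) / max(1, #{j: W_j >= t}) <= q.
    Returns np.inf when no candidate satisfies the bound.
    """
    W = np.asarray(W, dtype=float)
    candidates = np.sort(np.unique(np.abs(W[W != 0])))
    for t in candidates:
        # Large negative statistics stand in for false positives among the selections.
        fdp_estimate = (1 + np.sum(W <= -t)) / max(1, np.sum(W >= t))
        if fdp_estimate <= q:
            return t
    return np.inf

def knockoff_select(W, q=0.1):
    """Return indices of features whose statistic meets the knockoff+ threshold."""
    W = np.asarray(W, dtype=float)
    tau = knockoff_plus_threshold(W, q)
    return np.flatnonzero(W >= tau)

if __name__ == "__main__":
    # Made-up statistics for ten hypothetical taxa, for illustration only.
    W_example = [5, 4, 3, 2.5, 2, 1.5, -1, 0.5, 0.2, -0.1]
    print(knockoff_select(W_example, q=0.2))
```

The quality of the knockoff copies determines how informative W is. The sketch takes W as given and covers only the thresholding step.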

Keywords

microbiome data

knockoff filters

FDR control for high-dimensional data

compositional data