10 - Batch effect in microbiome data
Conference: Women in Statistics and Data Science 2022
10/07/2022: 2:30 PM - 4:00 PM CDT
Speed
Room: Grand Ballroom Salon G
Microbiome studies have been gaining enormous popularity among scientists as a way to characterize human health and disease. While many statistical analysis tools work similarly well across most high-dimensional data, such as gene-expression data, microbiome data require attention to compositionality, meaning that the data are relative abundances based on taxon counts. Because reproducibility is difficult to achieve with such data, we aim to examine the batch effect, i.e., systematic bias across datasets collected at different sites or times. In microbiome experiments, combining several data sets is often considered for the sake of statistical power, in the hope of discovering reliable biomarkers and establishing more robust prognostic models. The unique challenge in microbiome data, however, is the sum-to-one constraint: the relative abundance of a taxon is vulnerable to a different set of microbiota being measured in a different experiment. For example, certain transformations in Euclidean space are not robust to sub-compositionality. Therefore, simply adding samples from a different subset of features risks misleading results rather than gaining power. In this talk, we aim to provide helpful advice on the use of statistical methods in multi-batch situations, covering sub-compositionality, false discovery rate, and dependency among features.
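The sum-to-one constraint can be illustrated with a minimal sketch. The taxon names and counts below are hypothetical, chosen only for illustration and not taken from the talk; `closure` is a helper defined here, not a library function. The sketch assumes one batch measures four taxa while a second batch misses one of them, and shows that the relative abundance of the same taxon changes while the log-ratio between two taxa does not.

```python
import math


def closure(counts):
    """Rescale non-negative counts so they sum to one (relative abundances)."""
    total = sum(counts.values())
    return {taxon: c / total for taxon, c in counts.items()}


# Hypothetical taxon counts for a single sample (illustrative values only).
sample = {"taxon_A": 30, "taxon_B": 10, "taxon_C": 20, "taxon_D": 40}

# Assumption: batch 1 profiles all four taxa; batch 2 does not measure taxon_D.
full = closure(sample)
sub = closure({t: c for t, c in sample.items() if t != "taxon_D"})

# The relative abundance of taxon_A depends on which taxa were measured.
print(full["taxon_A"])  # 0.3
print(sub["taxon_A"])   # 0.5

# The log-ratio of two taxa is the same in the full and the reduced composition.
log_ratio_full = math.log(full["taxon_A"] / full["taxon_B"])
log_ratio_sub = math.log(sub["taxon_A"] / sub["taxon_B"])
print(math.isclose(log_ratio_full, log_ratio_sub))  # True
```

In this toy setting, pooling the two batches on raw relative abundances would treat 0.3 and 0.5 as different signals for the same underlying counts, which is one way a batch with a different feature set can mislead rather than add power.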
microbiome
high dimensional compositional data
batch effect
subcompositionality
false discovery rate
reproducibility
Presenting Author
Jung Ae Lee, University of Massachusetts Chan Medical School
First Author
Jung Ae Lee, University of Massachusetts Chan Medical School
Target Audience
Mid-Level
Tracks
Knowledge