Microbiome Data Integration via Shared Dictionary Learning
Abstract Number:
991
Submission Type:
Contributed Abstract
Contributed Abstract Type:
Paper
Participants:
Bo Yuan (1), Shulei Wang (1)
Institutions:
(1) N/A, N/A
Co-Author:
First Author:
Presenting Author:
Abstract Text:
Data integration is a powerful tool for building a comprehensive understanding of microbial communities and their association with outcomes of interest. However, integrating data sets from different studies remains challenging because of severe batch effects, unobserved confounding variables, and high heterogeneity across data sets. We propose a new data integration method called MetaDICT, which first estimates batch effects using weighting methods from the causal inference literature and then refines the estimates via a novel shared dictionary learning approach. Compared with existing methods, MetaDICT better avoids overcorrection of batch effects and preserves biological variation when unobserved confounding variables are present or when data sets are highly heterogeneous across studies. Applications to synthetic and real microbiome data sets demonstrate the robustness and effectiveness of MetaDICT in integrative analysis. Using MetaDICT, we characterize microbial interactions, identify generalizable microbial signatures, and improve the accuracy of disease prediction in an integrative analysis of colorectal cancer metagenomics studies.
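The two-stage idea (an initial batch-effect estimate, then refinement through a dictionary shared across studies) could be sketched roughly as follows. This is a toy illustration, not the MetaDICT algorithm: the multiplicative per-taxon batch effect, the SVD-based dictionary, the alternating heuristic, and the function name `shared_dictionary_refine` are all assumptions made for this example, and the initial weights are simply perturbed true values rather than the output of a causal-inference weighting step.

```python
import numpy as np

def shared_dictionary_refine(datasets, init_weights, rank, n_iter=20):
    """Toy heuristic assuming X_i ~ diag(w_i) @ D @ R_i with one shared D.

    datasets: list of (taxa x samples) arrays, one per study.
    init_weights: list of per-taxon batch-effect estimates, one per study.
    """
    weights = [np.asarray(w, dtype=float).copy() for w in init_weights]
    for _ in range(n_iter):
        # Undo the current batch-effect estimate in each data set.
        corrected = [X / w[:, None] for X, w in zip(datasets, weights)]
        # Shared dictionary: top left singular vectors of the pooled data.
        U, _, _ = np.linalg.svd(np.hstack(corrected), full_matrices=False)
        D = U[:, :rank]
        for i, (X, C) in enumerate(zip(datasets, corrected)):
            fit = D @ (D.T @ C)  # low-rank fit D @ R_i for study i
            # Per-taxon least-squares rescaling of the fit onto the raw data.
            num = np.sum(X * fit, axis=1)
            den = np.sum(fit * fit, axis=1) + 1e-12
            weights[i] = np.clip(num / den, 1e-3, None)
    return D, weights

# Synthetic example: two studies sharing a rank-3 dictionary (hypothetical data).
rng = np.random.default_rng(0)
D0 = rng.random((30, 3))
true_w = [rng.uniform(0.5, 1.5, 30) for _ in range(2)]
datasets = [w[:, None] * (D0 @ rng.random((3, 40))) for w in true_w]
noisy_init = [w * rng.uniform(0.9, 1.1, 30) for w in true_w]
D, w_hat = shared_dictionary_refine(datasets, noisy_init, rank=3)
```

Note that a multiplicative model of this kind is only identified up to a rescaling shared across studies, so the refined weights in such a sketch are meaningful relative to one another rather than in absolute terms.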
Keywords:
data integration|shared dictionary learning|batch effect|microbiome|embedding
Sponsors:
Section on Statistics in Genomics and Genetics
Tracks:
Miscellaneous