Machine Learning-Driven Mediation CNN (Med-CNN) Model for High-Dimensional Mediation Data

Jasper Zhongyuan Zhang Co-Author
University of Toronto
 
Olli Saarela Co-Author
University of Toronto
 
Divya Sharma Co-Author
University of Toronto/University Health Network
 
Wei Xu Co-Author
University of Toronto
 
Yao Li First Author
 
Jasper Zhongyuan Zhang Presenting Author
University of Toronto
 
Sunday, Aug 3: 5:05 PM - 5:20 PM
1997 
Contributed Papers 
Music City Center 
Complex biological features such as the microbiome and gene expression mediate disease progression by influencing immune and metabolic processes. Understanding these mediating roles is crucial for explaining disease pathogenesis and guiding treatment. However, high-dimensional mediation analysis is challenging because the mediators exhibit structural dependencies, correlations, and hierarchical relationships, such as microbial taxonomies and gene pathways. The sheer number of mediators further complicates conventional approaches.
We propose Med-CNN, a Convolutional Neural Network (CNN)-based model that integrates biological networks to estimate mediation effects. Network-specific CNN outputs are condensed into an Integrative Mediation Metric (IMM) to capture key biological information while handling high-dimensional data and non-linear interactions. Our model accommodates complex structures and improves interpretability in mediation analysis.
In simulations, Med-CNN produced less biased estimates of mediation effects than conventional methods. In a real-data application, it identified a mediation effect in the relationship between ethnicity and vaginal pH, demonstrating its robustness in analyzing high-dimensional mediators.
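
As a rough illustration of two ideas in the abstract (condensing many mediators into one summary score, and measuring the bias of a mediation-effect estimate by simulation), the sketch below may help. It is not the Med-CNN architecture or the authors' simulation design. A fixed equal weighting stands in for the learned network outputs, the estimator is the conventional product-of-coefficients estimator under a linear model, and every function name and parameter value (simulate, indirect_effect, simulation_bias, alpha, beta, gamma) is hypothetical.

```python
import random


def mean(v):
    return sum(v) / len(v)


def slope_and_residuals(x, y):
    """Least-squares fit of y ~ 1 + x; returns (slope, residuals)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    resid = [(yi - my) - b * (xi - mx) for xi, yi in zip(x, y)]
    return b, resid


def simulate(n, p, alpha, beta, gamma, rng):
    """Return exposure x, condensed mediator score s, and outcome y.

    Assumption: a fixed equal weight per mediator replaces a learned metric.
    """
    w = 1.0 / p ** 0.5
    x, s, y = [], [], []
    for _ in range(n):
        xi = rng.gauss(0, 1)
        # p mediators, each shifted by the exposure plus independent noise
        m = [alpha * w * xi + rng.gauss(0, 1) for _ in range(p)]
        si = w * sum(m)  # condense the mediators into one score
        x.append(xi)
        s.append(si)
        y.append(gamma * xi + beta * si + rng.gauss(0, 1))
    return x, s, y


def indirect_effect(x, s, y):
    """Product-of-coefficients estimate of the effect of x on y through s."""
    a_hat, s_res = slope_and_residuals(x, s)      # mediator model: s ~ 1 + x
    _, y_res = slope_and_residuals(x, y)          # remove x from y
    b_hat, _ = slope_and_residuals(s_res, y_res)  # coefficient of s in y ~ 1 + x + s
    return a_hat * b_hat


def simulation_bias(n_rep=100, n=300, p=20, alpha=0.5, beta=0.8,
                    gamma=0.3, seed=0):
    """Mean estimate minus the true indirect effect (alpha * beta)."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_rep):
        x, s, y = simulate(n, p, alpha, beta, gamma, rng)
        estimates.append(indirect_effect(x, s, y))
    return mean(estimates) - alpha * beta


if __name__ == "__main__":
    print(f"estimated bias: {simulation_bias():+.4f}")
```

In this toy setting the summary score is exactly the quantity that carries the effect, so the conventional estimator is close to unbiased. The abstract concerns the harder case where the relevant structure among mediators is unknown and non-linear.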

Keywords

High-dimensional

Mediation analysis

Microbiome

Deep learning 

Main Sponsor

Section on Statistical Learning and Data Science