Federated multimodal learning with heterogeneous modality and distribution shift

Huiyuan Wang Co-Author
University of Pennsylvania
 
Jingyue Huang Co-Author
 
Yong Chen Co-Author
University of Pennsylvania, Perelman School of Medicine
 
Dazheng Zhang First Author
 
Dazheng Zhang Presenting Author
 
Sunday, Aug 3: 4:50 PM - 5:05 PM
2432 
Contributed Papers 
Music City Center 
Federated learning enables the analysis of multi-site real-world data (RWD) while preserving data privacy, but two challenges persist: sites differ in which data modalities are available, and data distributions shift across sites. In this work, we develop a federated multimodal learning framework that improves causal inference in distributed research networks (DRNs) by integrating electronic health records (EHRs) and genetic biomarkers. Traditional methods often fail to account for structural missingness and site-specific heterogeneity, which leads to biased estimates of treatment effects.
To address these limitations, the proposed statistical framework accounts for distribution shifts in populations across sites, and it pursues both efficiency and bias correction by leveraging information from all available modalities across sites. In addition, we use multiple negative control outcomes to calibrate the estimates and to mitigate residual systematic bias, including bias from unmeasured confounding.
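
The abstract does not specify how the negative control outcomes enter the calibration. As an illustration only, the sketch below shows one simple moment-based form of calibration, under the assumption that each negative control outcome has a true effect of zero, so that its estimate reflects systematic bias alone. The function name, arguments, and example numbers are hypothetical and are not taken from the authors' method.

```python
import numpy as np


def calibrate_with_negative_controls(theta_hat, se_hat, nc_estimates, nc_ses):
    """Shift and widen a target effect estimate using negative control outcomes.

    Assumption (not from the abstract): every negative control outcome has a
    true effect of zero, so its estimate equals systematic bias plus noise.
    """
    nc_estimates = np.asarray(nc_estimates, dtype=float)
    nc_ses = np.asarray(nc_ses, dtype=float)

    # Average systematic bias, estimated from the negative control estimates.
    bias_mean = nc_estimates.mean()

    # Spread of the bias across negative controls: the sample variance of the
    # estimates minus their average sampling variance, floored at zero.
    bias_var = max(nc_estimates.var(ddof=1) - np.mean(nc_ses ** 2), 0.0)

    # Remove the estimated bias and propagate its uncertainty into the
    # standard error of the target estimate.
    calibrated = theta_hat - bias_mean
    calibrated_se = np.sqrt(se_hat ** 2 + bias_var)
    return calibrated, calibrated_se


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    est, se = calibrate_with_negative_controls(
        theta_hat=0.5,
        se_hat=0.1,
        nc_estimates=[0.1, 0.2, 0.3],
        nc_ses=[0.05, 0.05, 0.05],
    )
    print(f"calibrated estimate: {est:.3f}, calibrated SE: {se:.3f}")
```

In a federated setting, only site-level summaries such as these estimates and standard errors would need to be shared, not patient-level data; how the proposed framework combines them across sites is not described in the abstract.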

Keywords

Causal inference

Negative control outcomes

Average treatment effect

Bias correction

Multi-modality

Main Sponsor

Section on Statistical Learning and Data Science