Domain Adaptation Optimized for Robustness in Mixture Populations

Zijian Guo Co-Author
Rutgers University
 
Tianxi Cai Co-Author
Harvard University
 
Molei Liu Co-Author
Columbia University
 
Xin Xiong Co-Author
Harvard University
 
Keyao Zhan First Author
Harvard University
 
Keyao Zhan Presenting Author
Harvard University
 
Tuesday, Aug 5: 10:50 AM - 11:05 AM
1343 
Contributed Papers 
Music City Center 
Integrative analysis of multi-institutional biobank-linked EHR data can advance precision medicine by leveraging large, diverse datasets. Yet generalizing findings to target populations is difficult due to inherent demographic and clinical heterogeneity. Existing transfer learning methods often assume the target shares an outcome model with at least one source, overlooking settings where the target is a mixture of the source populations. Additional challenges arise when the outcome of interest is not directly observed and the population mixture must be characterized through a broader set of clinical characteristics. To address these challenges under shifts in both covariates and outcome models, we propose Domain Adaptation Optimized for Robustness in Mixture Populations (DORM). Leveraging partially labeled source data, DORM first builds an initial target model under a joint source-mixture assumption, then applies group adversarial learning to optimize worst-case performance in a neighborhood of this initial model. A tuning strategy refines the approach when a limited number of target labels are available. Asymptotic results establish statistical convergence and predictive accuracy, and simulations and real-world studies show that DORM outperforms existing methods.
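To illustrate the group-adversarial idea described in the abstract, below is a minimal sketch of a worst-case mixture-weighted fit over several source populations. This is not the authors' DORM implementation; it assumes fully labeled sources, a squared-error outcome model, and an exponentiated-gradient update for the adversarial weights, and the names `dorm_adversarial_fit`, `eta`, and `n_iters` are illustrative choices rather than notation from the paper.

```python
import numpy as np

def dorm_adversarial_fit(sources, eta=0.05, lr=0.01, n_iters=500):
    """Fit one coefficient vector by minimizing the worst-case
    mixture-weighted squared loss across K source groups.

    sources : list of (X_k, y_k) pairs, one per source population.
    eta     : step size for the adversary's exponentiated-gradient update.
    lr      : gradient step size for the model coefficients.
    """
    K = len(sources)
    p = sources[0][0].shape[1]
    beta = np.zeros(p)
    q = np.full(K, 1.0 / K)   # adversarial mixture weights on the simplex

    for _ in range(n_iters):
        # Per-group average losses and gradients under the current model.
        losses = np.empty(K)
        grads = np.zeros((K, p))
        for k, (X, y) in enumerate(sources):
            resid = X @ beta - y
            losses[k] = np.mean(resid ** 2)
            grads[k] = 2.0 * X.T @ resid / len(y)

        # Adversary: up-weight the worst-performing groups.
        q *= np.exp(eta * losses)
        q /= q.sum()

        # Learner: descend the mixture-weighted gradient.
        beta -= lr * (q @ grads)

    return beta, q
```

In this sketch, the returned weights `q` indicate which source populations drive the worst-case risk; DORM additionally anchors the adversary near an initial target model built under the joint source-mixture assumption and uses a tuning step when a few target labels are available, details that are omitted here.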

Keywords

Domain adaptation

Multi-source data

Mixture population

Group distributional robustness 

Main Sponsor

Section on Statistical Learning and Data Science