Targeted learning via probabilistic subpopulation matching

Jie Hu Co-Author
University of Pennsylvania
 
Naimin Jing Co-Author
Merck & Co.
 
Yang Ning Co-Author
Cornell University
 
Cheng Yong Tang Co-Author
Temple University
 
Runze Li Co-Author
Penn State University
 
Yong Chen Co-Author
University of Pennsylvania, Perelman School of Medicine
 
Xiaokang Liu First Author
University of Missouri
 
Xiaokang Liu Presenting Author
University of Missouri
 
Thursday, Aug 7: 10:05 AM - 10:20 AM
2699 
Contributed Papers 
Music City Center 
Transferring knowledge from similar source studies has been shown to improve prediction accuracy in a target study. However, in many real-world biomedical applications, populations in different studies (e.g., clinical sites) can be heterogeneous, which makes it challenging to borrow information properly for the target study. When study-level matching is used to identify similar source studies, all samples from a source study that differs significantly from the target study are dropped, which can lead to substantial information loss. We consider a general situation where all studies are sampled from a super-population composed of distinct subpopulations, and propose a novel framework of targeted learning via subpopulation matching. We first fit a finite mixture model jointly across all studies to obtain subject-wise probabilistic subpopulation information, and then transfer knowledge from source studies to the target study within each identified subpopulation. By measuring similarities between subpopulations, our method effectively decomposes between-study heterogeneity and allows knowledge transfer from all source studies without dropping any samples.
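
The two-step idea in the abstract (a joint finite mixture fit, then borrowing within subpopulations) can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a single covariate, two Gaussian subpopulations, and an outcome mean that is shared across studies within a subpopulation. The function names (`fit_mixture`, `simulate`), the simulated data, and all parameter values are hypothetical, and a plain weighted average stands in for the generalized linear regression named in the keywords.

```python
import math
import random


def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_mixture(xs, n_iter=100):
    """EM for a two-component 1-D Gaussian mixture on the pooled covariate.

    Returns subject-wise membership probabilities (one pair per subject).
    """
    lo, hi = min(xs), max(xs)
    means = [lo + 0.25 * (hi - lo), lo + 0.75 * (hi - lo)]
    sds = [(hi - lo) / 4.0, (hi - lo) / 4.0]
    props = [0.5, 0.5]
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each subpopulation for each subject.
        resp = []
        for x in xs:
            dens = [props[k] * normal_pdf(x, means[k], sds[k]) for k in range(2)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: update mixing proportions, means, and standard deviations.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            props[k] = nk / len(xs)
            means[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, xs)) / nk
            sds[k] = max(math.sqrt(var), 1e-6)
    return resp


def simulate(rng, n, prop_first):
    """Hypothetical study: (covariate, outcome) pairs from two subpopulations."""
    rows = []
    for _ in range(n):
        k = 0 if rng.random() < prop_first else 1
        x = rng.gauss(-2.0 if k == 0 else 2.0, 0.5)
        y = rng.gauss(1.0 if k == 0 else 5.0, 0.3)
        rows.append((x, y))
    return rows


rng = random.Random(0)
target = simulate(rng, 30, 0.8)    # small target study, mostly subpopulation 0
source = simulate(rng, 300, 0.2)   # large source study, mostly subpopulation 1
pooled = target + source           # no source sample is dropped

# Step 1: joint mixture fit across all studies.
resp = fit_mixture([x for x, _ in pooled])

# Step 2 (assumed simplification): estimate each subpopulation's outcome mean
# from all studies, weighting every subject by its membership probability.
outcome_means = []
for k in range(2):
    nk = sum(r[k] for r in resp)
    outcome_means.append(sum(r[k] * y for r, (_, y) in zip(resp, pooled)) / nk)

# Predictions for target subjects mix the subpopulation-level estimates.
target_pred = [
    sum(r[k] * outcome_means[k] for k in range(2)) for r in resp[: len(target)]
]
print(outcome_means)
```

In this toy setup the source study looks dissimilar to the target at the study level because the mixing proportions differ, yet its subjects still inform the estimate for the target's dominant subpopulation through their membership probabilities.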

Keywords

Finite mixture model

Generalized linear regression

Subpopulation structure

Transfer learning 

Main Sponsor

Biometrics Section