Multi-Source Conformal Inference Under Distribution Shift

Yi Liu (First Author), North Carolina State University
Alexander Levis (Co-Author), Carnegie Mellon University
Sharon-Lise Normand (Co-Author), Harvard Medical School
Larry Han (Co-Author, Presenting Author), Northeastern University
 
Tuesday, Aug 6: 2:20 PM - 2:35 PM
2419 
Contributed Papers 
Oregon Convention Center 
Recent years have seen growing use of machine learning models to inform high-stakes decision-making. However, distribution shifts and privacy concerns make it challenging to achieve valid inference in multi-source environments. We generate distribution-free prediction intervals for a target population, leveraging multiple potentially biased data sources. We derive the efficient influence functions for the quantiles of unobserved outcomes and show that one can incorporate machine learning prediction algorithms in the estimation of nuisance functions while still achieving parametric rates of convergence. Moreover, when conditional outcome invariance is violated, we propose a data-adaptive strategy to weight data sources to balance efficiency gain and bias reduction. We highlight the robustness and efficiency of our proposals for a variety of conformal scores and data-generating mechanisms via extensive synthetic experiments and real data analyses.
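As background for the distribution-free prediction intervals the abstract refers to, the following is a minimal sketch of standard single-source split conformal prediction with absolute-residual scores. It is not the authors' multi-source estimator (which handles distribution shift and source weighting); the function name and interface are hypothetical, chosen for illustration.

```python
import numpy as np

def split_conformal_interval(cal_scores, y_pred_new, alpha=0.1):
    """Split conformal interval from absolute-residual conformal scores.

    cal_scores: array of |y_i - yhat_i| on a held-out calibration set.
    y_pred_new: point prediction for a new covariate vector.
    Under exchangeability, the returned interval has marginal
    coverage of at least 1 - alpha.
    """
    n = len(cal_scores)
    # Finite-sample-corrected rank: the ceil((n + 1)(1 - alpha))-th
    # smallest calibration score.
    k = int(np.ceil((n + 1) * (1 - alpha)))
    q = np.sort(cal_scores)[min(k, n) - 1]
    return y_pred_new - q, y_pred_new + q
```

The finite-sample correction `(n + 1)(1 - alpha)` (rather than `n(1 - alpha)`) is what yields the exact coverage guarantee; under distribution shift between sources and target, this unweighted quantile is no longer valid, which is the gap the presented work addresses.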

Keywords

Conformal prediction

Distribution shift

Federated learning

Missing data

Machine learning

Data integration 

Main Sponsor

IMS