Fast and robust invariant generalized linear models

Ndey Isatou Jobe Co-Author
Harvard T.H. Chan School of Public Health
 
Rui Duan Co-Author
 
Parker Knight First Author, Presenting Author
 
Thursday, Aug 7: 9:05 AM - 9:20 AM
1341 
Contributed Papers 
Music City Center 
Statistical integration of diverse data sources is an essential step in building generalizable prediction tools, especially in precision health. The invariant features model is a new paradigm for multi-source data integration which posits that a small number of covariates affect the outcome identically across all possible environments. Existing methods for estimating invariant effects either incur immense computational costs or offer good statistical performance only under strict assumptions. In this work, we provide a general framework for estimation under the invariant features model that is computationally efficient and statistically flexible. We also provide a robust extension of our proposed method to protect against possibly corrupted or misspecified data sources. We demonstrate the performance of our method in simulations, and use it to build a transferable kidney disease prediction model from electronic health records in the All of Us Research Program.

Keywords

data integration

generalizability

precision health

electronic health records 

Main Sponsor

Biometrics Section