Semi-Supervised Calibration of Inferred Outcomes for Estimating Average Treatment Effects

David Cheng, Speaker

Daniel Xu, Co-Author
University of Pennsylvania

Katherine Liao, Co-Author
Brigham and Women's Hospital

Tianxi Cai, Co-Author
Harvard University

Monday, Aug 3: 3:20 PM - 3:35 PM
2952 
Contributed Papers 
Thomas M. Menino Convention & Exhibition Center 
Estimating treatment effects with electronic health record (EHR) data is complicated by the lack of validated outcome measures collected under standardized protocols. Typically, outcomes are approximated using putative rules or phenotyping models trained on small sub-samples. Errors in these inferred outcomes can bias downstream treatment effect estimates. We develop a semi-supervised method to calibrate inferred outcomes for estimating average treatment effects when a sub-sample is labeled at random with validated outcomes. The calibration ensures that the subsequent treatment effect estimator remains consistent despite errors in the inferred outcomes. This problem is analogous to estimating mean outcomes in a longitudinal study subject to monotone missingness, and we demonstrate connections with existing augmented inverse probability weighting estimators. The proposed estimator is multiply robust and locally semiparametric efficient. It can achieve efficiency gains in finite samples due to an effective normalization of implicit augmentation terms. Its performance is evaluated in simulations and illustrated in an analysis of anti-TNF therapies in rheumatoid arthritis.
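
The general idea of correcting inferred outcomes with a randomly labeled sub-sample can be sketched in a few lines. The following is a minimal illustration only, not the estimator proposed in the abstract: it uses a simple residual correction with normalized (Hajek) inverse probability weights and assumes a known treatment probability. All function names, the simulated data, and the parameter values (`label_rate`, the error structure of the proxy outcome) are hypothetical and chosen for illustration.

```python
import random


def weighted_mean(values, weights):
    """Normalized (Hajek) weighted mean."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def ipw_ate(a, y, ps, keep=None):
    """Normalized IPW treatment contrast over units where keep[i] is True."""
    if keep is None:
        keep = [True] * len(a)
    idx = [i for i in range(len(a)) if keep[i]]
    w1 = [a[i] / ps[i] for i in idx]              # weights for treated units
    w0 = [(1 - a[i]) / (1 - ps[i]) for i in idx]  # weights for control units
    vals = [y[i] for i in idx]
    return weighted_mean(vals, w1) - weighted_mean(vals, w0)


def calibrated_ate(a, y_hat, y, labeled, ps):
    """Contrast on inferred outcomes (all units) plus a contrast on the
    residuals y - y_hat (labeled units only). Assumes labels are assigned
    completely at random, so the labeled residuals estimate the bias."""
    naive = ipw_ate(a, y_hat, ps)
    resid = [(y[i] - y_hat[i]) if labeled[i] else 0.0 for i in range(len(a))]
    correction = ipw_ate(a, resid, ps, keep=labeled)
    return naive + correction


def simulate(n=20000, label_rate=0.2, seed=0):
    """Hypothetical data: true ATE is 1.0, and the inferred outcome has an
    error that differs by treatment arm, so the naive contrast is biased."""
    rng = random.Random(seed)
    a, y_hat, y, labeled, ps = [], [], [], [], []
    for _ in range(n):
        x = rng.gauss(0, 1)
        p = 0.5                                   # assumed known propensity
        t = 1 if rng.random() < p else 0
        outcome = x + 1.0 * t + rng.gauss(0, 1)
        proxy = outcome + 1.0 * t + rng.gauss(0, 0.5)
        r = rng.random() < label_rate             # labeled at random
        a.append(t)
        y_hat.append(proxy)
        y.append(outcome if r else None)          # validated outcome if labeled
        labeled.append(r)
        ps.append(p)
    return a, y_hat, y, labeled, ps


if __name__ == "__main__":
    a, y_hat, y, labeled, ps = simulate()
    print("naive ATE from inferred outcomes:", ipw_ate(a, y_hat, ps))
    print("residual-corrected ATE:", calibrated_ate(a, y_hat, y, labeled, ps))
```

In this sketch the correction term relies entirely on labels being assigned at random, matching the labeling assumption stated in the abstract. It does not attempt the multiple robustness or efficiency properties the abstract describes.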

Keywords

Comparative effectiveness

Electronic health records

Semiparametric efficiency

Multiple robustness

Semi-supervised learning 

Main Sponsor

Section on Statistics in Epidemiology