
Semi-Supervised Calibration of Inferred Outcomes for Estimating Average Treatment Effects

Presented During: Section on Statistics in Epidemiology - Measurement Error & Calibration

David Cheng Speaker

Daniel Xu Co-Author
University of Pennsylvania

Katherine Liao Co-Author
Brigham and Women's Hospital

Tianxi Cai Co-Author
Harvard University

Monday, Aug 3: 3:20 PM - 3:35 PM
2952
Contributed Papers

Thomas M. Menino Convention & Exhibition Center

Estimating treatment effects with electronic health record (EHR) data is complicated by the lack of validated outcome measures collected under standardized protocols. Typically, outcomes are approximated using putative rules or phenotyping models that are trained on small sub-samples. Errors in the inferred outcomes can introduce biases in downstream treatment effect estimates. We develop a semi-supervised method to calibrate inferred outcomes for estimation of average treatment effects when a sub-sample is labeled at random with validated outcomes. The calibration ensures that the subsequent treatment effect estimator remains consistent despite errors in the inferred outcomes. This problem is analogous to that of estimating mean outcomes in a longitudinal study subject to monotone missingness, and we demonstrate connections with existing augmented inverse probability weighting estimators. The proposed estimator is multiply robust and locally semiparametric efficient. It can achieve efficiency gains in finite samples due to an effective normalization of implicit augmentation terms. The performance is evaluated in simulations and illustrated in an analysis of anti-TNF therapies in rheumatoid arthritis.
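The general idea of correcting inferred outcomes with a randomly labeled sub-sample can be illustrated with a deliberately simplified sketch. It is not the estimator proposed in this talk: it assumes a randomized binary treatment (so no propensity model is needed), labels drawn completely at random, and simulated data. The function name `calibrated_ate` and all simulation parameters are hypothetical. Within each treatment arm, the mean inferred outcome is shifted by the mean residual (validated minus inferred) observed among labeled patients, which is a normalized form of an augmented inverse probability weighting correction.

```python
import numpy as np


def calibrated_ate(a, y_inferred, y_valid, labeled):
    """Difference in arm means of inferred outcomes, each shifted by the
    mean (validated - inferred) residual among labeled units in that arm.

    Simplifying assumptions (not from the abstract): treatment is
    randomized and labels are assigned completely at random.
    """
    a = np.asarray(a, dtype=bool)
    labeled = np.asarray(labeled, dtype=bool)
    y_inferred = np.asarray(y_inferred, dtype=float)
    y_valid = np.asarray(y_valid, dtype=float)

    def arm_mean(arm):
        # Mean inferred outcome over everyone in the arm.
        naive_mean = y_inferred[arm].mean()
        # Average error of the inferred outcome, estimated on labeled units only.
        lab = arm & labeled
        correction = (y_valid[lab] - y_inferred[lab]).mean()
        return naive_mean + correction

    return arm_mean(a) - arm_mean(~a)


# Simulated example (all numbers are illustrative assumptions).
rng = np.random.default_rng(0)
n = 200_000
a = rng.random(n) < 0.5                             # randomized treatment
y = (rng.random(n) < (0.3 + 0.2 * a)).astype(float)  # true ATE is 0.2

# The inferred outcome misses 40% of true events, but only in the treated arm.
missed = a & (y == 1) & (rng.random(n) < 0.4)
y_inferred = np.where(missed, 0.0, y)

# Validated outcomes are available for a random 10% sub-sample only.
labeled = rng.random(n) < 0.1
y_valid = np.where(labeled, y, np.nan)

naive = y_inferred[a].mean() - y_inferred[~a].mean()
est = calibrated_ate(a, y_inferred, y_valid, labeled)
print(f"naive ATE from inferred outcomes: {naive:.3f}")
print(f"label-corrected ATE:              {est:.3f}")
```

In this simulation the differential misclassification pulls the naive estimate away from the true effect of 0.2, while the corrected estimate recovers it up to sampling error. Handling confounded treatment assignment, as in observational EHR data, would additionally require propensity score or outcome models, which this sketch omits.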

Keywords

Comparative effectiveness

Electronic health records

Semiparametric efficiency

Multiple robustness

Semi-supervised learning

Main Sponsor

Section on Statistics in Epidemiology