Assessing treatment effects in observational data with missing or mismeasured confounders: A comparative study of practical doubly-robust and traditional missing data methods

Brian Williamson Co-Author
Kaiser Permanente Washington Health Research Institute
 
Chloe Krakauer Co-Author
 
Eric Johnson Co-Author
Kaiser Permanente Washington Health Research Institute
 
Susan Gruber Co-Author
TL Revolution, LLC
 
Bryan Shepherd Co-Author
Vanderbilt University, School of Medicine
 
Mark Van Der Laan Co-Author
UC Berkeley
 
Thomas Lumley Co-Author
University of Auckland
 
Hana Lee Co-Author
Food and Drug Administration
 
José Hernández-Muñoz Co-Author
Food and Drug Administration
 
Fengyu Zhao Co-Author
FDA, CDER
 
Sarah Dutcher Co-Author
Food and Drug Administration
 
Rishi Desai Co-Author
Brigham and Women’s Hospital, Harvard Medical School
 
Gregory Simon Co-Author
Kaiser Permanente Washington Health Research Institute
 
Susan M Shortreed Co-Author
Kaiser Permanente Washington Health Research Institute
 
Jennifer Nelson Co-Author
Kaiser Permanente Washington Health Research Institute
 
Pamela Shaw Speaker
Kaiser Permanente Washington Health Research Institute
 
Wednesday, Aug 6: 9:15 AM - 9:35 AM
Invited Paper Session 
Music City Center 
For safety and rare outcome studies in pharmacoepidemiology, multiple large databases are often merged to improve statistical power and create a more generalizable cohort. Medical claims data have become a mainstay in evaluating the safety and effectiveness of medications post-approval, but confounders derived from administrative data can be prone to measurement error. Electronic health records (EHR) data and data abstracted from chart review contain more granular patient information than medical claims, but the gold standard exposure data may be available only for a subset. I will discuss two practical-to-implement doubly-robust estimators for this setting, one relying on a type of survey calibration and another utilizing targeted maximum likelihood estimation (TMLE), and compare their performance with that of more traditional missing data methods in a detailed numerical study. The numerical work includes plasmode simulation studies that emulate the complex data structure of a real, large EHR cohort to compare antidepressant therapies in a setting where a key confounder is prone to missingness.

Keywords

doubly-robust methods

missing data

electronic health records

targeted maximum likelihood estimation

generalized raking

survey calibration