PPV-guided cost-effective chart review for model-agnostic RWD-based discovery

Yiwen Lu Co-Author
 
Yong Chen Co-Author
University of Pennsylvania, Perelman School of Medicine
 
Jiayi Tong First Author
 
Yiwen Lu Presenting Author
 
Sunday, Aug 4: 3:10 PM - 3:15 PM
2967 
Contributed Speed 
Oregon Convention Center 
Electronic Health Record (EHR)-based association studies are commonly used to identify risk factors associated with patients' clinical phenotypes. Although EHR-derived phenotypes (i.e., surrogates) are increasingly used, manual chart review remains the gold standard for ensuring phenotype quality. Chart review is time-consuming and costly, so determining how many charts must be reviewed is of great importance. In this paper, we propose a method guided by the positive and negative predictive values (PPV/NPV) of the surrogates to determine the minimum sample size required for chart review, thereby substantially reducing cost. We then introduce an augmented estimation procedure that combines the chart-reviewed phenotypes with the surrogates to obtain asymptotically unbiased and efficient estimators for EHR-based association studies. Our approach offers a cost-effective solution that preserves accuracy and efficiency in estimation without requiring the misclassification mechanism of the surrogates to be specified explicitly. The robustness of the method is validated through extensive simulation studies and an evaluation on real-world data, using the Flatiron dataset as a benchmark.
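
For readers unfamiliar with the quantities involved, the following minimal sketch shows how PPV and NPV are computed from a chart-reviewed subset, and how a textbook normal-approximation formula gives the number of charts needed to estimate a predictive value to a chosen precision. It is an illustration only and is not the procedure proposed in this abstract, which is not described in enough detail here to reproduce. The function names `ppv_npv` and `charts_needed`, the counts, and the choice of a Wald interval are all assumptions made for the example.

```python
import math


def ppv_npv(tp, fp, tn, fn):
    """Predictive values of a surrogate phenotype, with chart review as truth.

    tp, fp: surrogate-positive patients confirmed / refuted by chart review.
    tn, fn: surrogate-negative patients confirmed / refuted by chart review.
    """
    ppv = tp / (tp + fp)  # P(true case | surrogate positive)
    npv = tn / (tn + fn)  # P(true non-case | surrogate negative)
    return ppv, npv


def charts_needed(expected_value, half_width, z=1.96):
    """Charts to review so that a Wald interval for a proportion has the
    requested half-width (z = 1.96 corresponds to 95% confidence).

    Assumption: the standard binomial formula n = z^2 * p * (1 - p) / d^2,
    used purely for illustration, not the authors' sample size rule.
    """
    n = z ** 2 * expected_value * (1 - expected_value) / half_width ** 2
    return math.ceil(n)


if __name__ == "__main__":
    # Hypothetical validation counts, not taken from any real dataset.
    ppv, npv = ppv_npv(tp=80, fp=20, tn=90, fn=10)
    print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}")
    print("Charts among surrogate positives:", charts_needed(ppv, 0.05))
    print("Charts among surrogate negatives:", charts_needed(npv, 0.05))
```

Under this simple formula the required number of charts shrinks as the anticipated predictive value moves away from 0.5, which gives some intuition for why a high-quality surrogate can reduce the review burden.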

Keywords

Electronic Health Record (EHR)

Association study

Outcome dependent sampling 

Main Sponsor

Section on Statistics in Epidemiology