Why and how we validate EHR Data: Making them aud-it they can be
Conference: Women in Statistics and Data Science 2025
11/13/2025: 11:45 AM - 1:15 PM EST
Panel
Data from electronic health records (EHR) present a huge opportunity to operationalize a standardized whole-person health score in the learning health system and to identify at-risk patients on a large scale, but they are prone to missingness and errors. Ignoring these data quality issues could lead to biased statistical results and incorrect clinical decisions. Validation of EHR data (e.g., through chart reviews) can provide better-quality data, but realistically, only a subset of patients' data can be validated, and most protocols do not recover missing data. Using a representative sample of 1000 patients from the EHR at an extensive learning health system (100 of whom could be validated), we propose methods to design, conduct, and analyze statistically efficient and robust studies of the allostatic load index (ALI), a whole-person health score, and healthcare utilization. Targeted validation with an enriched protocol allowed us to ensure the quality and promote the completeness of the EHR data. Findings from our validation study were incorporated into statistical models, which indicated that worse whole-person health was associated with higher odds of engaging in the healthcare system, after adjusting for age.
Speaker
Sarah Lotspeich, Wake Forest University