13. Machine Learning Model Robustness and Performance Stability in Future Years when Predicting Adverse Events in a Veteran Population and a Diabetic Subpopulation

Conference: Women in Statistics and Data Science 2024
10/17/2024: 11:45 AM - 1:15 PM EDT
Speed 

Description

We developed machine learning models to predict adverse events after Veterans received non-steroidal anti-inflammatory drugs (NSAIDs) during acute care encounters, and we evaluated model robustness in subsequent years and within a subpopulation of patients with diabetes mellitus. We collected electronic health record data from a national U.S. Veteran population aged ≥18 years who presented to an emergency department or urgent care center, were prescribed NSAIDs between 1/1/2017 and 12/31/2023, and survived more than 1 day after the encounter. The outcome of interest was any adverse event within 30 days of the visit (acute kidney injury stage 2-3, gastroesophageal reflux disease, gastrointestinal bleed, or allergic reaction). Using 85 clinical patient variables from care delivered in 2017, we built a logistic regression model with LASSO regularization and an extreme gradient boosting (xgboost) model. We tested the 2017 models on encounter data from each subsequent year, 2018 through 2023. We assessed model performance using the calibration slope and the area under the receiver operating characteristic curve (AUC). We also assessed model performance in the subgroup of patients with a history of diabetes. The incidence of any adverse event was 4.9% in the entire cohort and 6.3% in the diabetic subgroup. For the 2017 models evaluated on 2023 encounters, the calibration slope was 1.020 for LASSO and 1.040 for xgboost, and AUC was similar (xgboost 0.790, LASSO 0.789). For the same models and test data restricted to patients with diabetes, the calibration slope was 1.020 for xgboost and 0.975 for LASSO, and AUC was again similar (LASSO 0.783, xgboost 0.782). Model performance for years 2018-2022 was similar. The models also performed moderately well over time in the diabetic subgroup, but performance should be reassessed in a non-Veteran population before making wide generalizations about the model predictions.
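The two performance measures named above can be illustrated with a minimal plain-Python sketch. It is not the authors' code; the abstract does not state what software was used. It assumes the calibration slope is the coefficient from a logistic regression of the observed outcome on the logit of the predicted risk, and that AUC is the probability that a randomly chosen patient with an event receives a higher predicted risk than one without. The function names, the `toy_year` helper, and the toy data are hypothetical and are not study data.

```python
import math

def auc(y_true, y_prob):
    """AUC as the share of (event, non-event) pairs ranked correctly; ties count half."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def calibration_slope(y_true, y_prob, iters=25):
    """Fit logit(P(y=1)) = a + b * logit(predicted risk) by Newton-Raphson; return (a, b)."""
    x = [math.log(p / (1 - p)) for p in y_prob]
    a, b = 0.0, 1.0  # start at perfect calibration
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y_true):
            mu = 1 / (1 + math.exp(-(a + b * xi)))
            w = mu * (1 - mu)
            g0 += yi - mu            # gradient w.r.t. intercept
            g1 += (yi - mu) * xi     # gradient w.r.t. slope
            h00 += w                 # entries of the information matrix
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b

def toy_year(groups):
    """Expand (predicted_risk, n_patients, n_events) groups into per-patient lists."""
    y, p = [], []
    for risk, n, events in groups:
        y += [1] * events + [0] * (n - events)
        p += [risk] * n
    return y, p

# Hypothetical test sets (invented for illustration only).
toy_years = {
    "well calibrated": [(0.2, 10, 2), (0.5, 10, 5), (0.8, 10, 8)],
    "too extreme": [(0.1, 10, 2), (0.5, 10, 5), (0.9, 10, 8)],
}
results = {}
for label, groups in toy_years.items():
    y, p = toy_year(groups)
    intercept, slope = calibration_slope(y, p)
    results[label] = (auc(y, p), slope)
    print(f"{label}: AUC={results[label][0]:.3f}, calibration slope={slope:.3f}")
```

In this sketch both toy sets rank patients identically, so their AUC is the same, while the calibration slope separates them: a slope near 1 means predicted risks are spread about as widely as observed risks, and a slope below 1 means the predictions are too extreme. Applying such functions to each test year in turn would mirror the year-by-year evaluation described above.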

Keywords

prediction modeling

LASSO regularization

extreme gradient boosting

Veteran population

diabetes mellitus

adverse events 

Presenting Author

Amy Perkins, Vanderbilt University Medical Center

First Author

Amy Perkins, Vanderbilt University Medical Center

CoAuthor(s)

Michael J Ward, Vanderbilt University Medical Center
Jesse Wrenn, Vanderbilt University Medical Center
Robert Winter, Vanderbilt University Medical Center
Chad Dorn, Vanderbilt University Medical Center
Amber Hackstadt, Vanderbilt University Medical Center
Michael E Matheny, Vanderbilt University Medical Center

Target Audience

Mid-Level

Tracks

Knowledge
Women in Statistics and Data Science 2024