06: Addressing Missing Data in SDOH: Imputation and Translation
Wednesday, Aug 6: 10:30 AM - 12:20 PM
2376
Contributed Posters
Music City Center
Missing data in social determinants of health (SDOH) can hinder the effectiveness of statistical models aimed at understanding and addressing health disparities. This project focuses on testing and implementing methods for imputing SDOH data that are missing at random, as well as translating SDOH data that are missing by design. Three approaches (Bayesian regression, linear regression, and predictive mean matching), all implemented with the R package mice (Multivariate Imputation by Chained Equations), were tested on a training dataset. Each method was evaluated using root mean squared error (RMSE), the correlation between imputed and actual values, mean absolute percentage error (MAPE), and computation time. For RMSE and correlation, no model consistently showed a significant advantage over the others. For MAPE, the models using predictive mean matching were consistently better than those using Bayesian and linear regression. For computation time, the Bayesian approach was the fastest, though not significantly faster than linear regression, and predictive mean matching took the longest.
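As an illustration of the evaluation criteria named above, the following is a minimal Python sketch of how RMSE, MAPE, the correlation between imputed and actual values, and computation time might be computed for values that were masked and then imputed. It is not the authors' code: the study used the R package mice, and the function names here (rmse, mape, correlation, timed) are hypothetical helpers introduced only for this example.

```python
import time

import numpy as np


def rmse(actual, imputed):
    """Root mean squared error between actual and imputed values."""
    actual = np.asarray(actual, dtype=float)
    imputed = np.asarray(imputed, dtype=float)
    return float(np.sqrt(np.mean((actual - imputed) ** 2)))


def mape(actual, imputed):
    """Mean absolute percentage error, in percent.

    Assumes no actual value is zero; MAPE is undefined otherwise.
    """
    actual = np.asarray(actual, dtype=float)
    imputed = np.asarray(imputed, dtype=float)
    return float(np.mean(np.abs((actual - imputed) / actual)) * 100.0)


def correlation(actual, imputed):
    """Pearson correlation between actual and imputed values."""
    return float(np.corrcoef(actual, imputed)[0, 1])


def timed(impute_fn, *args, **kwargs):
    """Run an imputation function and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = impute_fn(*args, **kwargs)
    return result, time.perf_counter() - start


# Toy values standing in for the held-out entries (not data from the study).
actual = np.array([10.0, 20.0, 40.0])
imputed = np.array([12.0, 18.0, 44.0])

scores = {
    "rmse": rmse(actual, imputed),
    "mape": mape(actual, imputed),
    "correlation": correlation(actual, imputed),
}
```

In practice, the actual values would be entries deliberately hidden from a complete dataset, and each imputation method would be wrapped in timed so that accuracy and run time are recorded together.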
Main Sponsor
Government Statistics Section