06: Addressing Missing Data in SDOH: Imputation and Translation

Semhar Michael Co-Author
South Dakota State University
 
Hossein Moradi Rekabdarkolaee Co-Author
South Dakota State University
 
Mary Row First Author
 
Mary Row Presenting Author
 
Wednesday, Aug 6: 10:30 AM - 12:20 PM
2376 
Contributed Posters 
Music City Center 
Missing data in social determinants of health (SDOH) can hinder the effectiveness of statistical models aimed at understanding and addressing health disparities. This project tests and implements methods for imputing SDOH data that is missing at random and for translating SDOH data that is missing by design. Three approaches (Bayesian regression, linear regression, and predictive mean matching) were run using the R package mice (Multivariate Imputation by Chained Equations) and compared on a training dataset. Each method was evaluated using root mean squared error (RMSE), correlation between the imputed and actual values, mean absolute percentage error (MAPE), and computation time. In terms of RMSE and correlation, no method showed a consistent, significant advantage over the others. In terms of MAPE, the models using predictive mean matching were consistently better than those using Bayesian and linear regression. In terms of computation time, the Bayesian approach was the fastest, though not significantly faster than linear regression, and predictive mean matching took the longest.
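To make the four evaluation criteria concrete, the following is a minimal sketch in Python, not the authors' R workflow. It hides known values, imputes them, and scores only the hidden cells on RMSE, correlation, MAPE, and elapsed time. The synthetic data, the 20% missingness rate, and the helper names (`regression_impute`, `evaluate`) are assumptions made for illustration. The simple least-squares imputer is only a stand-in for the mice methods named in the abstract.

```python
import time
import numpy as np


def rmse(actual, imputed):
    """Root mean squared error between true and imputed values."""
    return float(np.sqrt(np.mean((imputed - actual) ** 2)))


def mape(actual, imputed):
    """Mean absolute percentage error, in percent (undefined if any true value is 0)."""
    return float(np.mean(np.abs((imputed - actual) / actual)) * 100.0)


def correlation(actual, imputed):
    """Pearson correlation between true and imputed values."""
    return float(np.corrcoef(actual, imputed)[0, 1])


def regression_impute(x, y):
    """Hypothetical stand-in imputer: fill missing y from a least-squares line on x."""
    observed = ~np.isnan(y)
    slope, intercept = np.polyfit(x[observed], y[observed], deg=1)
    filled = y.copy()
    filled[~observed] = intercept + slope * x[~observed]
    return filled


def evaluate(impute_fn, x, y_true, mask):
    """Hide y_true where mask is True, impute, and score only the hidden cells."""
    y_missing = y_true.copy()
    y_missing[mask] = np.nan
    start = time.perf_counter()
    y_filled = impute_fn(x, y_missing)
    seconds = time.perf_counter() - start
    actual, imputed = y_true[mask], y_filled[mask]
    return {
        "rmse": rmse(actual, imputed),
        "correlation": correlation(actual, imputed),
        "mape": mape(actual, imputed),
        "seconds": seconds,
    }


# Synthetic data (assumed for illustration only, not SDOH data).
rng = np.random.default_rng(0)
x = rng.uniform(1.0, 10.0, size=200)
y_true = 5.0 + 2.0 * x + rng.normal(0.0, 1.0, size=200)
mask = rng.random(200) < 0.2  # hide about 20% of values, independent of x and y

scores = evaluate(regression_impute, x, y_true, mask)
print(scores)
```

Scoring only the masked cells keeps observed values from inflating the metrics. Any other imputer with the same `(x, y)` signature can be passed to `evaluate` for a side-by-side comparison.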

Main Sponsor

Government Statistics Section