06: Addressing Missing Data in SDOH: Imputation and Translation

Presented During: Contributed Poster Presentations: Government Statistics Section

Semhar Michael Co-Author
South Dakota State University

Hossein Moradi Rekabdarkolaee Co-Author
South Dakota State University

Mary Row First Author

Mary Row Presenting Author

Wednesday, Aug 6: 10:30 AM - 12:20 PM
2376
Contributed Posters

Music City Center

Missing data in social determinants of health (SDOH) can hinder the effectiveness of statistical models aimed at understanding and addressing health disparities. This project tests and implements methods for imputing SDOH data that are missing at random and for translating SDOH data that are missing by design. Several imputation approaches, including Bayesian regression, linear regression, and predictive mean matching, as implemented in the R package mice (Multivariate Imputation by Chained Equations), were tested on a training dataset. Each method was evaluated using root mean squared error (RMSE), the correlation between imputed and actual values, mean absolute percentage error (MAPE), and computation time. In terms of RMSE and correlation, no method showed a consistent, significant advantage over the others. In terms of MAPE, predictive mean matching consistently outperformed Bayesian and linear regression. In terms of computation time, the Bayesian approach was the fastest, though not significantly faster than linear regression, and predictive mean matching was the slowest.
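
The accuracy criteria named in the abstract (RMSE, correlation, and MAPE) can be illustrated with a minimal sketch. This is not the authors' code: it is written in Python rather than R, the function name `evaluate_imputation` is hypothetical, the sample values are made up for illustration, and it assumes the true values of the imputed cells are known (as when values are masked on purpose for testing) and are non-zero, which MAPE requires.

```python
import math


def evaluate_imputation(actual, imputed):
    """Compare imputed values with the held-out true values.

    Hypothetical helper: returns RMSE, Pearson correlation, and MAPE (in percent).
    Assumes both sequences have the same length and no actual value is zero.
    """
    if len(actual) != len(imputed) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(actual)

    # Root mean squared error: penalizes large errors more heavily.
    rmse = math.sqrt(sum((i - a) ** 2 for a, i in zip(actual, imputed)) / n)

    # Mean absolute percentage error: error relative to the size of the true value.
    mape = 100.0 * sum(abs((i - a) / a) for a, i in zip(actual, imputed)) / n

    # Pearson correlation between imputed and actual values.
    mean_a = sum(actual) / n
    mean_i = sum(imputed) / n
    cov = sum((a - mean_a) * (i - mean_i) for a, i in zip(actual, imputed))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_i = sum((i - mean_i) ** 2 for i in imputed)
    correlation = cov / math.sqrt(var_a * var_i)

    return {"rmse": rmse, "correlation": correlation, "mape": mape}


# Made-up example values: true entries that were masked, and their imputations.
actual = [10.0, 20.0, 30.0, 40.0]
imputed = [12.0, 18.0, 33.0, 36.0]
result = evaluate_imputation(actual, imputed)
print(result)
```

Because MAPE scales each error by the true value, it can rank methods differently from RMSE, which is consistent with the abstract reporting a clear MAPE ranking alongside no clear RMSE ranking.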

Main Sponsor

Government Statistics Section