34: The Use of Multiple Imputation for Missing Data in A Health-Related Study
Bin Ge
First Author
University of Missouri-Columbia
Bin Ge
Presenting Author
University of Missouri-Columbia
Tuesday, Aug 5: 10:30 AM - 12:20 PM
1135
Contributed Posters
Music City Center
Multiple imputation of missing data has been an active area of statistical research since before the big data era. In this project, we study the use of the multiple imputation approach on a health-related data set with eight identified variables whose missing data rates range from 0 to 16%. We conducted multiple imputations (simple random) on this data set.
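As an illustration of how results from several imputed data sets are usually combined, the following is a minimal sketch of Rubin's rules for a single coefficient. The abstract does not state which pooling procedure was used, so this is an assumption; the function name `pool_rubin` and the example numbers are hypothetical.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one coefficient across m imputed data sets with Rubin's rules.

    estimates: point estimates, one per imputed data set
    variances: squared standard errors, one per imputed data set
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching lists with at least two imputations")
    q_bar = mean(estimates)        # pooled point estimate
    within = mean(variances)       # average within-imputation variance
    between = variance(estimates)  # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return q_bar, total


# Hypothetical log-odds estimates and squared standard errors from m = 5 imputations.
coef, total_var = pool_rubin(
    [0.42, 0.38, 0.45, 0.40, 0.41],
    [0.010, 0.011, 0.009, 0.010, 0.010],
)
print(coef, total_var ** 0.5)  # pooled coefficient and its pooled standard error
```

The total variance adds the between-imputation spread, inflated by 1 + 1/m, to the average within-imputation variance, so the pooled standard error reflects the uncertainty caused by the missing values.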
Furthermore, to investigate the use of multiple imputation under a variety of missing data structures and missing data rates, we generated incomplete data sets from the complete data set obtained from the health-related data. The generated incomplete data sets were analyzed with logistic regression, using multiple imputation to handle the missing data. The regression results for those incomplete data sets were compared with the results obtained from analysis of the complete data set. Our results suggest that, in the logistic regression analysis, estimates obtained using five imputations are similar to those obtained using 100 imputations. Our results indicate that missing data have a substantial influence on coefficients, odds ratios, and p-values in logistic regression analysis, especially when the missing rate is high. In such cases, even with multiple imputati
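The step of generating incomplete data sets from a complete one can be sketched as follows. This is a minimal example that assumes values are deleted completely at random (MCAR) in one variable at a fixed rate; the abstract does not specify the missing data structures that were studied, and the helper `make_mcar`, the column names, and the toy data are hypothetical.

```python
import random


def make_mcar(rows, column, rate, seed=0):
    """Return a copy of rows with a fixed share of `column` values set to None.

    Entries are chosen uniformly at random, so missingness does not depend
    on any observed or unobserved value (an MCAR assumption).
    """
    rng = random.Random(seed)
    n_missing = round(rate * len(rows))
    drop = set(rng.sample(range(len(rows)), n_missing))
    return [
        {**row, column: None} if i in drop else dict(row)
        for i, row in enumerate(rows)
    ]


# Toy complete data set standing in for the complete health-related data.
complete = [{"age": 40 + i, "outcome": i % 2} for i in range(50)]
incomplete = make_mcar(complete, "age", rate=0.16)
print(sum(r["age"] is None for r in incomplete))  # number of deleted values
```

Repeating this over several rates and seeds gives a family of incomplete data sets whose imputed regression results can be compared against the fit on the complete data.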
Missing data
multiple imputation
simulation study
logistic regression
Health-related study
Main Sponsor
Section on Statistical Computing