How a misclassified binary outcome Y* affects model prediction on the correct Y: a simulation study
Conference: Symposium on Data Science and Statistics (SDSS) 2023
05/24/2023: 4:20 PM - 4:25 PM CDT
Lightning
Statistical models are used to explain and/or predict an outcome of interest. For explanation, the focus is on estimating parameters that describe an independent variable's effect. In this regard, the effect of a misclassified outcome variable, and how to correct for it, has been studied extensively, with one popular method being MCSIMEX. However, a relevant question yet to be addressed is how misclassification affects predictive performance. We investigate this through extensive simulation studies. Motivated by a real-world example, we generated a binary event status Y that is subject to misclassification. We fit a logistic regression model using the misclassified Y* and assessed model performance on a test dataset simulated from the same underlying model without misclassification. We show that predictive performance on the test data is similar regardless of whether the misclassified Y* was corrected, and is always better than performance on the training data.
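The simulation design described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the study: the sample sizes, the true coefficients, the sensitivity and specificity of the misclassification mechanism, the use of AUC as the performance measure, and the helper names `simulate`, `misclassify`, `fit_logistic`, and `auc`. The sketch covers only the uncorrected fit on Y*; no MCSIMEX correction is included.

```python
import numpy as np


def simulate(n, beta, rng):
    """Draw covariates and a true binary outcome Y from a logistic model."""
    x = rng.normal(size=(n, len(beta) - 1))
    design = np.column_stack([np.ones(n), x])  # intercept + covariates
    p = 1.0 / (1.0 + np.exp(-design @ beta))
    y = rng.binomial(1, p)
    return design, y


def misclassify(y, sensitivity, specificity, rng):
    """Flip true 1s with prob. 1 - sensitivity and true 0s with prob. 1 - specificity."""
    u = rng.random(y.shape[0])
    flip = np.where(y == 1, u > sensitivity, u > specificity)
    return np.where(flip, 1 - y, y)


def fit_logistic(design, y, n_iter=25):
    """Maximum-likelihood logistic regression via Newton-Raphson."""
    beta = np.zeros(design.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-design @ beta))
        grad = design.T @ (y - p)
        hess = design.T @ (design * (p * (1.0 - p))[:, None])
        beta = beta + np.linalg.solve(hess, grad)
    return beta


def auc(y, score):
    """Rank-based (Mann-Whitney) AUC; assumes no ties in the scores."""
    order = np.argsort(score)
    ranks = np.empty(len(score))
    ranks[order] = np.arange(1, len(score) + 1)
    n1 = y.sum()
    n0 = len(y) - n1
    return (ranks[y == 1].sum() - n1 * (n1 + 1) / 2) / (n1 * n0)


rng = np.random.default_rng(2023)
beta_true = np.array([-1.0, 1.0, -0.5])  # hypothetical true coefficients

# Training and test data come from the same underlying model.
x_train, y_train = simulate(5000, beta_true, rng)
x_test, y_test = simulate(5000, beta_true, rng)

# Only the training outcome is misclassified (hypothetical error rates).
y_star = misclassify(y_train, sensitivity=0.8, specificity=0.9, rng=rng)

# Fit on the misclassified Y*, then score against Y* (train) and true Y (test).
beta_hat = fit_logistic(x_train, y_star)
auc_train = auc(y_star, x_train @ beta_hat)
auc_test = auc(y_test, x_test @ beta_hat)

print(f"AUC on training data (misclassified Y*): {auc_train:.3f}")
print(f"AUC on test data (true Y):               {auc_test:.3f}")
```

Repeating this over many random seeds and over a grid of error rates would give a rough picture of how training and test performance diverge as misclassification increases.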
Misclassification error
Prediction power
Binary response
Simulation study
Presenting Author
Zorina Han, University of Alberta
First Author
Zorina Han, University of Alberta
CoAuthor
Yan Yuan, University of Alberta
Target Audience
Mid-Level
Tracks
Practice and Applications