Comparing Feature Selection Methods in Clinical Data Modeling: LASSO and Stepwise Regression

Daniel Einhorn Co-Author
Corcept Therapeutics Inc.
 
Cristina Tudor Co-Author
Corcept Therapeutics Inc.
 
Yumeng Wang First Author
 
Yumeng Wang Presenting Author
 
Wednesday, Aug 6: 2:05 PM - 2:20 PM
1814 
Contributed Papers 
Music City Center 
In clinical data modeling, a common challenge is that the number of candidate features is high relative to the number of patients, which complicates reliable inference. Regression models are frequently employed because they are interpretable and yield parameter estimates with confidence intervals. While stepwise feature selection has historically been popular, recent studies suggest that regularized methods such as LASSO, Ridge Regression, and Elastic Net offer superior performance.

This study evaluates and compares the performance of LASSO, Ridge Regression, Elastic Net, and Stepwise Regression using both simulated datasets and a prospective study of endogenous hypercortisolism in a population with difficult-to-control type 2 diabetes. Key metrics include feature selection overlap, parameter estimates with confidence intervals, and test statistics. Results indicate a significant overlap in features selected by LASSO and Stepwise Regression, with LASSO selecting a more comprehensive and robust feature set. LASSO also outperforms Stepwise Regression in accuracy and robustness, while Stepwise Regression exhibits a higher tendency for overfitting.
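
As an illustration of the feature selection overlap metric, the following minimal sketch compares LASSO and forward stepwise selection on one simulated dataset. It is not the study's implementation. The simulation settings (100 patients, 40 features, 5 active features with effect size 2.0), the fixed LASSO penalty `lam=0.25`, the BIC stopping rule for stepwise selection, the Jaccard index as the overlap measure, and all function names are assumptions made for this example. In practice the penalty would typically be tuned, for instance by cross-validation.

```python
import numpy as np

def simulate(n=100, p=40, k=5, effect=2.0, seed=0):
    """Gaussian design with k truly active features (all settings hypothetical)."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))
    beta = np.zeros(p)
    beta[:k] = effect
    y = X @ beta + rng.standard_normal(n)
    return X, y, set(range(k))

def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate descent for (1/2n)*||y - Xb||^2 + lam*||b||_1 on centered data."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    r = y - y.mean()                      # residual for the all-zero start
    b = np.zeros(p)
    col_sq = (Xc ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            r = r + Xc[:, j] * b[j]       # remove feature j's current contribution
            rho = Xc[:, j] @ r / n
            b[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]  # soft-threshold
            r = r - Xc[:, j] * b[j]       # add the updated contribution back
    return b

def bic(X, y, cols):
    """BIC of an ordinary least squares fit with intercept on the given columns."""
    n = len(y)
    A = np.column_stack([np.ones(n)] + [X[:, j] for j in cols])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = float(((y - A @ coef) ** 2).sum())
    return n * np.log(rss / n) + A.shape[1] * np.log(n)

def forward_stepwise(X, y, max_features=10):
    """Greedy forward selection; stops when BIC no longer improves (assumed rule)."""
    selected, current = [], bic(X, y, [])
    while len(selected) < max_features:
        candidates = [j for j in range(X.shape[1]) if j not in selected]
        scores = {j: bic(X, y, selected + [j]) for j in candidates}
        best = min(scores, key=scores.get)
        if scores[best] >= current:
            break
        selected.append(best)
        current = scores[best]
    return selected

X, y, truth = simulate()
coef = lasso_cd(X, y, lam=0.25)
lasso_set = {j for j in range(X.shape[1]) if abs(coef[j]) > 1e-8}
step_set = set(forward_stepwise(X, y))

# Jaccard index: shared features divided by all features chosen by either method
overlap = len(lasso_set & step_set) / len(lasso_set | step_set)
print("LASSO:", sorted(lasso_set))
print("Stepwise:", sorted(step_set))
print("Jaccard overlap:", round(overlap, 3))
```

Repeating this over many seeds, and comparing each selected set against `truth`, would give a rough picture of how often each method recovers the active features and how many spurious features it admits.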

Keywords

Lasso

stepwise regression

simulation

hypercortisolism 

Main Sponsor

Section on Statistical Learning and Data Science