High-Dimensional Variable Selection: an Ensemble-based Method

Xiaofeng Wang Co-Author
The Cleveland Clinic Foundation
 
Han Sun First Author
Cleveland Clinic
 
Han Sun Presenting Author
Cleveland Clinic
 
Wednesday, Aug 6: 11:20 AM - 11:35 AM
1952 
Contributed Papers 
Music City Center 
Variable selection in high-dimensional data analysis poses substantial methodological challenges. While numerous penalized variable selection methods and machine learning approaches exist, many demonstrate instability in real-world applications.
We developed a novel ensemble algorithm for variable selection in competing risks modeling and conducted a comprehensive stability analysis of established variable selection methods. Our method, the Random Approximate Elastic Net (RAEN), offers a stable and generalizable solution for large-p-small-n variable selection in competing risks data. RAEN's flexible framework enables its application across various time-to-event regression models, including competing risks quantile regression and accelerated failure time models. In a numerical study, we demonstrate that our computationally intensive algorithm substantially improves both variable selection accuracy and parameter estimation. We have implemented RAEN in a user-friendly R package. To demonstrate its practical utility, we apply RAEN to a cancer study.

Keywords

variable selection

high-dimensional

flexible objective function

Main Sponsor

Section on Medical Devices and Diagnostics