Robust Weighted Random Forest with Imbalanced Classification Problems

Yunbi Nam Co-Author
 
Sunwoo Han First Author
University of Miami
 
Sunwoo Han Presenting Author
University of Miami
 
Sunday, Aug 4: 3:25 PM - 3:30 PM
2800 
Contributed Speed 
Oregon Convention Center 
In many applications, it is common to have numerous features with different levels of information together with an imbalanced outcome ratio. Weighted Random Forest (WRF) has been used to address the low signal-to-noise problem by assigning larger weights to informative features, which prioritizes their inclusion in the feature subset considered at each node of individual trees. However, it has not been actively studied for class-imbalanced problems. In this work, we propose using RF variable importance measured in terms of the area under the receiver operating characteristic curve, referred to as VI-AUC, as the weights in WRF to account for class imbalance. Our simulation studies show that WRF with VI-AUC is superior to and more stable than other weighting methods, particularly in class-imbalanced scenarios with small sample sizes. We illustrate applications using an immunologic marker dataset from an HIV vaccine efficacy trial.
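
The node-level weighting described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`auc`, `importance_to_weights`, `sample_feature_subset`), the small positive floor applied to non-positive importances, and the example importance values are all hypothetical. The sketch assumes that AUC-based importance scores have already been computed, for example as the drop in AUC after permuting a feature, which is one common construction and is not confirmed by the abstract.

```python
import random


def auc(scores, labels):
    """AUC via the Mann-Whitney statistic: share of (positive, negative)
    pairs ranked correctly, counting ties as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def importance_to_weights(importances, floor=1e-6):
    """Turn importance scores into sampling weights that sum to 1.
    Assumption: scores at or below zero are raised to a small floor so
    every feature keeps a nonzero chance of selection."""
    clipped = [max(v, floor) for v in importances]
    total = sum(clipped)
    return [v / total for v in clipped]


def sample_feature_subset(weights, mtry, rng):
    """Draw mtry distinct feature indices without replacement, each draw
    with probability proportional to the remaining weights."""
    remaining = list(range(len(weights)))
    w = list(weights)
    chosen = []
    for _ in range(min(mtry, len(remaining))):
        pick = rng.choices(range(len(remaining)), weights=w, k=1)[0]
        chosen.append(remaining.pop(pick))
        w.pop(pick)
    return chosen


# Hypothetical importance scores for four features (illustrative only).
vi_scores = [0.20, 0.05, -0.01, 0.00]
weights = importance_to_weights(vi_scores)
rng = random.Random(0)
subset = sample_feature_subset(weights, mtry=2, rng=rng)
```

In a full WRF, a draw like `sample_feature_subset` would replace the uniform feature sampling at every node of every tree, so features with higher weight are offered as split candidates more often. The `auc` helper depends only on how positives rank against negatives rather than on a classification threshold, which is the usual motivation for AUC-based measures under class imbalance.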

Keywords

Variable importance

Weighted random forest

Class imbalance

AUC 

Main Sponsor

ENAR

Co Sponsors

ENAR