Variable Weighted Random Forest for Two-Phase Case-Control Studies

Yunbi Nam Co-Author
Vanderbilt University
 
Youyi Fong Co-Author
Fred Hutchinson Cancer Research Center
 
Sunwoo Han First Author
University of Miami
 
Sunwoo Han Presenting Author
University of Miami
 
Thursday, Aug 7: 10:05 AM - 10:20 AM
1246 
Contributed Papers 
Music City Center 
Variable weighted random forest (vwRF) is a variant of random forest (RF) that assigns different weights to features when sampling candidate variables at each node of a tree during model construction. As a feature selection method, vwRF has shown strong prediction performance in low signal-to-noise problems. However, it has not been studied with datasets from two-phase case-control studies, which suffer from low signal-to-noise ratios and class imbalance simultaneously. In this talk, we introduce a novel weighting strategy for vwRF that improves prediction in two-phase sampling designs facing these problems. For the weights, we adopt RF permutation variable importance combined with the area under the precision-recall curve and the area under the receiver operating characteristic curve. We demonstrate the improved prediction of our proposed methods through simulation studies, and we illustrate their use on a real immunologic biomarker dataset from the RV144 phase 3 HIV vaccine efficacy trial.
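
The node-level step that distinguishes vwRF from a standard RF, drawing candidate split variables with unequal probabilities, can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the function name `sample_candidate_features`, the parameter `mtry`, the example weights, and the handling of negative or all-zero weights are assumptions made for the example. In the proposed method the weights would come from permutation importance scored by the precision-recall and ROC criteria, which the abstract does not specify in detail.

```python
import random


def sample_candidate_features(weights, mtry, rng):
    """Draw up to `mtry` distinct feature indices, each with probability
    proportional to its weight (hypothetical helper, for illustration)."""
    # Assumption: negative weights (e.g. negative permutation importances)
    # are treated as zero.
    w = [max(float(x), 0.0) for x in weights]
    if sum(w) == 0.0:
        # Assumption: with no positive weight, fall back to the uniform
        # feature sampling of a standard random forest.
        w = [1.0] * len(w)
    n_draws = min(mtry, sum(1 for x in w if x > 0.0))
    chosen = []
    for _ in range(n_draws):
        # Draw one index in proportion to the remaining weights ...
        idx = rng.choices(range(len(w)), weights=w, k=1)[0]
        chosen.append(idx)
        # ... then zero its weight so it cannot be drawn again.
        w[idx] = 0.0
    return chosen


rng = random.Random(0)
weights = [0.5, 0.0, 0.3, -0.1, 0.2]  # made-up importance scores
candidates = sample_candidate_features(weights, mtry=2, rng=rng)
```

With these made-up weights, only features 0, 2, and 4 can ever be offered as split candidates, and feature 0 is offered most often. Setting all weights equal recovers the uniform sampling of an ordinary RF.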

Keywords

Variable weighted random forest

Two-phase case-control study

HIV 

Main Sponsor

ENAR