Random Forests and Clustering for Identifying Clinical Phenotypes

Barbara Bailey First Author
San Diego State University
 
Barbara Bailey Presenting Author
San Diego State University
 
Tuesday, Aug 6: 9:50 AM - 9:55 AM
3520 
Contributed Speed 
Oregon Convention Center 

Description

Random Forests can be used for both classification and clustering. In a supervised Random Forest used for classification, each subject has a known group label. In an unsupervised Random Forest used for clustering, no group labels are available, and the forest is instead used to estimate the proximity matrix needed for clustering. Clustering algorithms use the data to form groups of similar subjects whose shared properties distinguish each group from the others. Phenotypes can be identified by generating a proximity matrix with an unsupervised Random Forest and then clustering with the Partitioning Around Medoids (PAM) algorithm.
PAM partitions subjects into clusters using a dissimilarity matrix (here derived from the Random Forest proximities) and is more robust to noise and outliers than the more commonly used k-means algorithm.
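
The clustering step can be illustrated with a minimal pure-Python sketch of a PAM-style procedure. It is not the implementation used in the study. Several things are assumptions made for illustration: the toy proximity matrix `P` is made up and is not study data, the function names `total_cost` and `pam` are hypothetical, and dissimilarity is taken as 1 minus proximity (the square root of that quantity is another common choice). The swap phase accepts the first improving swap, which simplifies classic PAM, where the best swap is chosen at each iteration.

```python
def total_cost(D, medoids):
    """Sum of each point's dissimilarity to its nearest medoid."""
    return sum(min(D[i][m] for m in medoids) for i in range(len(D)))


def pam(D, k):
    """Cluster n points into k groups from an n x n dissimilarity matrix D.

    BUILD: add medoids greedily, each time picking the point that
    lowers the total cost the most.
    SWAP: exchange a medoid with a non-medoid whenever that lowers
    the total cost, until no swap helps (first-improvement search).
    """
    n = len(D)

    # BUILD phase
    medoids = []
    for _ in range(k):
        candidates = [c for c in range(n) if c not in medoids]
        best = min(candidates, key=lambda c: total_cost(D, medoids + [c]))
        medoids.append(best)
    cost = total_cost(D, medoids)

    # SWAP phase
    improved = True
    while improved:
        improved = False
        for pos in range(k):
            for c in range(n):
                if c in medoids:
                    continue
                trial = medoids[:pos] + [c] + medoids[pos + 1:]
                trial_cost = total_cost(D, trial)
                if trial_cost < cost:
                    medoids, cost, improved = trial, trial_cost, True

    # Assign each point to the position of its nearest medoid
    labels = [min(range(k), key=lambda j: D[i][medoids[j]]) for i in range(n)]
    return medoids, labels, cost


# Toy proximity matrix for six subjects (made-up values, NOT study data).
# Subjects 0-2 are mutually close, as are subjects 3-5.
P = [
    [1.0, 0.9, 0.8, 0.1, 0.2, 0.1],
    [0.9, 1.0, 0.7, 0.2, 0.1, 0.1],
    [0.8, 0.7, 1.0, 0.1, 0.1, 0.2],
    [0.1, 0.2, 0.1, 1.0, 0.9, 0.8],
    [0.2, 0.1, 0.1, 0.9, 1.0, 0.7],
    [0.1, 0.1, 0.2, 0.8, 0.7, 1.0],
]

# Assumed conversion from proximity to dissimilarity: D = 1 - P
D = [[1.0 - p for p in row] for row in P]

medoids, labels, cost = pam(D, k=2)
print("medoids:", sorted(medoids))
print("labels:", labels)
```

Because the procedure needs only pairwise dissimilarities, it can work from a Random Forest proximity matrix without reference to the original feature space. Each medoid is an actual subject, which can help when describing a cluster as a phenotype.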

We present results that identify distinct phenotypes, or groups, among Hispanic/Latino subjects with chronic low back pain. The data consisted of sensor-based measures of posture and movement, pain behavior, and psychological measures. These groupings may provide a basis for a more personalized plan of care, including pain management strategies that encourage movement and rest periods.

Keywords

random forests

chronic lower back pain 

Main Sponsor

WNAR