Consensus Dimension Reduction via Data Integration

Tiffany Tang Co-Author
University of Notre Dame
 
Bingxue An First Author
University of Notre Dame
 
Bingxue An Presenting Author
University of Notre Dame
 
Monday, Aug 4: 11:05 AM - 11:10 AM
1501 
Contributed Speed 
Music City Center 
A plethora of dimension reduction methods have been developed to visualize high-dimensional data in low dimensions. However, different dimension reduction methods often output different visualizations, and many challenges make it difficult for researchers to determine which visualization is best. We thus propose a novel consensus dimension reduction framework, which summarizes multiple visualizations into a single "consensus" visualization. Here, we leverage ideas from data integration (or data fusion) to identify the patterns that are most stable or shared across the many different dimension reduction visualizations and subsequently visualize this shared structure in a single low-dimensional plot. We demonstrate that this consensus visualization effectively identifies and preserves the shared low-dimensional data structure through extensive simulations and real-world case studies. We further highlight our method's robustness to the choice of dimension reduction method and/or hyperparameters, a highly desirable property when working towards trustworthy and reproducible data science.
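
As a rough illustration of what combining several visualizations into one consensus plot could look like, the sketch below averages scale-normalized pairwise distance matrices from several embeddings of the same samples and re-embeds the average with classical multidimensional scaling. This is a simple stand-in chosen for illustration, not the data integration method proposed in the abstract, whose details are not given here. The function names (`pairwise_distances`, `classical_mds`, `consensus_embedding`) are hypothetical.

```python
import numpy as np


def pairwise_distances(X):
    """Euclidean distance matrix between the rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.sqrt(np.maximum(d2, 0.0))  # clip tiny negatives from round-off


def classical_mds(D, k=2):
    """Classical (Torgerson) MDS: embed a distance matrix D in k dimensions."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n       # centering matrix
    B = -0.5 * J @ (D ** 2) @ J               # double-centered Gram matrix
    vals, vecs = np.linalg.eigh(B)            # eigenvalues in ascending order
    top = np.argsort(vals)[::-1][:k]          # indices of the k largest
    return vecs[:, top] * np.sqrt(np.maximum(vals[top], 0.0))


def consensus_embedding(embeddings, k=2):
    """Illustrative consensus (an assumption, not the authors' method):
    average the scale-normalized distance matrices of all embeddings,
    then re-embed the average with classical MDS."""
    mats = []
    for E in embeddings:
        D = pairwise_distances(np.asarray(E, dtype=float))
        mats.append(D / D.mean())             # remove each embedding's overall scale
    return classical_mds(np.mean(mats, axis=0), k)
```

Working with distance matrices rather than raw coordinates makes the combination insensitive to rotation, reflection, and (after normalization) overall scale of each input visualization, which are arbitrary in most dimension reduction outputs. In practice the inputs would be embeddings of the same samples, in the same row order, from different methods or hyperparameter settings.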

Keywords

dimension reduction

data integration

data visualization 

Main Sponsor

Section on Statistical Learning and Data Science