Contrastive dimension estimation

Didong Li Co-Author
 
Sam Hawke First Author
 
Sam Hawke Presenting Author
 
Wednesday, Aug 7: 9:00 AM - 9:05 AM
2535 
Contributed Speed 
Oregon Convention Center 
Contrastive dimension reduction methods have been used to uncover the low-dimensional structure that distinguishes one dataset (the foreground) from another (the background). However, current contrastive dimension reduction techniques do not estimate the number of dimensions unique to the foreground data, denoted d_c. Instead, they require this quantity as an input and proceed to estimate the dimensions themselves. In this paper, we formally define the contrastive dimension, d_c, and present what we believe to be the first estimator for this parameter. Under a linear model, we demonstrate the consistency of this estimator, establish a finite-sample error bound, and develop a hypothesis test for d_c = 0. This test is valuable for determining whether a contrastive method is suitable for a given dataset. Furthermore, we provide a detailed analysis of our findings, supported by experiments on both synthetic and real-world datasets.

Keywords

Dimension reduction

Contrastive dimension 

Main Sponsor

Section on Statistical Learning and Data Science