Minimum Covariance Determinant: Spectral Embedding and Subset Size Determination

Yichi Zhang Co-Author
North Carolina State University
 
Kenneth Lange Co-Author
Department of Computational Medicine, UCLA
 
Qiang Heng First Author
 
Qiang Heng Presenting Author
 
Thursday, Aug 8: 10:05 AM - 10:20 AM
1954 
Contributed Papers 
Oregon Convention Center 
This paper introduces several ideas for the minimum covariance determinant (MCD) problem in outlier detection and robust estimation of means and covariances. We use the principal component transform to reduce dimension before analysis. Our best subset selection algorithm combines statistical depth with concentration steps. To choose the subset size and the number of principal components, we introduce a novel bootstrap procedure that estimates the instability of the best subset algorithm. The parameter combination with minimal instability proves ideal for outlier detection and robust estimation. Benchmarking against prominent MCD variants shows our approach's superior outlier detection and computational speed in high dimensions. Applications to a fruit spectra data set and a cancer genomics data set illustrate our claims.
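
As a rough illustration of two ingredients named in the abstract, the sketch below projects data onto leading principal components and then runs standard concentration steps (C-steps) on the scores. It is a minimal sketch under stated assumptions, not the authors' algorithm: the function names, the classical mean-centered PCA, and the median-based starting subset are all hypothetical stand-ins (the paper's depth-based initialization and bootstrap instability measure are not reproduced here).

```python
import numpy as np


def pca_scores(X, q):
    """Project mean-centered data onto its top q principal components."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:q].T


def initial_subset(Z, h):
    """Hypothetical stand-in for a depth-based start:
    the h points closest to the coordinatewise median."""
    dist = np.linalg.norm(Z - np.median(Z, axis=0), axis=1)
    return np.sort(np.argsort(dist)[:h])


def subset_estimates(Z, subset):
    """Mean and covariance of the rows of Z indexed by subset."""
    mu = Z[subset].mean(axis=0)
    cov = np.atleast_2d(np.cov(Z[subset], rowvar=False))
    return mu, cov


def concentration_steps(Z, subset, h, max_iter=100):
    """Repeat C-steps until the subset stops changing.

    Each step re-estimates the mean and covariance from the current subset,
    then keeps the h points with the smallest Mahalanobis distances.
    """
    subset = np.sort(np.asarray(subset))
    for _ in range(max_iter):
        mu, cov = subset_estimates(Z, subset)
        diff = Z - mu
        # Squared Mahalanobis distance of every point to the current fit.
        d2 = np.einsum("ij,ij->i", diff @ np.linalg.pinv(cov), diff)
        new_subset = np.sort(np.argsort(d2)[:h])
        if np.array_equal(new_subset, subset):
            break
        subset = new_subset
    mu, cov = subset_estimates(Z, subset)
    return subset, mu, cov
```

Given an n-by-p data matrix `X`, one might call `Z = pca_scores(X, q)` and then `concentration_steps(Z, initial_subset(Z, h), h)`; points outside the returned subset with large Mahalanobis distances are candidate outliers. The choice of `h` and `q` is left to the caller here, whereas the abstract selects them by minimizing bootstrap instability.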

Keywords

Robustness

Outliers

Principal component analysis

Statistical depth

Bootstrap

Algorithm instability 

Main Sponsor

Section on Statistical Computing