56: Optimal Shrinkage Estimation for Linear Discriminant Analysis in Ultra-high Dimension

Wonjun Seo (First Author, Presenting Author)
Xiucai Ding (Co-Author)

Tuesday, Aug 5: 10:30 AM - 12:20 PM
1588 
Contributed Posters 
Music City Center 
Linear discriminant analysis (LDA) faces significant challenges when the number of features (p) exceeds the number of observations (n). While various methods have been proposed to address this issue, most assume n and p are comparable or impose restrictive structural assumptions on the population covariance matrix. In this study, we present a unified framework for LDA based on an optimal shrinkage method designed for ultra-high dimensional data, where p grows polynomially in n. As examples within our framework, we consider two types of shrinkage estimators: a linear shrinker, leading to a regularized LDA, and a nonlinear shrinker under the generalized spiked covariance matrix model. Leveraging recent advances in random matrix theory, we establish theoretical guarantees for our approach by analyzing the asymptotic behavior of outlier eigenvalues and eigenvectors, as well as deriving a quantum unique ergodicity estimate for non-outlier eigenvectors of the spiked sample covariance matrix. These results also reveal a phase transition phenomenon in LDA, allowing us to characterize the conditions under which LDA succeeds or fails based on the magnitude of the mean difference.
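
For intuition only, here is a minimal sketch of the linear-shrinker case described above: the pooled sample covariance is shrunk toward a scaled identity (so it remains invertible when p > n) and then plugged into Fisher's discriminant rule, which is the sense in which a linear shrinker yields a regularized LDA. The fixed intensity lam, the helper names, and the toy spiked-mean data are illustrative assumptions, not the optimal shrinkage estimator analyzed in the poster.

import numpy as np

def linear_shrinkage_cov(X, lam):
    """Shrink the sample covariance toward a scaled identity:
    Sigma_hat = (1 - lam) * S + lam * mu * I, with mu = tr(S) / p."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    S = Xc.T @ Xc / n
    mu = np.trace(S) / p
    return (1.0 - lam) * S + lam * mu * np.eye(p)

def shrinkage_lda_fit(X0, X1, lam=0.5):
    """Two-class LDA with a pooled, linearly shrunk covariance.
    Returns the discriminant direction w and threshold c; a new
    point x is assigned to class 1 when w @ x > c."""
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    X_pooled = np.vstack([X0 - mu0, X1 - mu1])
    Sigma = linear_shrinkage_cov(X_pooled, lam)  # invertible even when p > n
    w = np.linalg.solve(Sigma, mu1 - mu0)        # Fisher direction
    c = w @ (mu0 + mu1) / 2.0                    # midpoint threshold
    return w, c

# Toy example with p > n and a spiked mean difference (hypothetical data).
rng = np.random.default_rng(0)
n, p = 50, 200
delta = np.zeros(p)
delta[:5] = 2.0                                  # mean shift on 5 coordinates
X0 = rng.standard_normal((n, p))
X1 = rng.standard_normal((n, p)) + delta
w, c = shrinkage_lda_fit(X0, X1)
X_test = rng.standard_normal((n, p)) + delta     # fresh draws from class 1
print("test accuracy on class 1:", np.mean(X_test @ w > c))

In the poster's framework the shrinkage would be chosen optimally rather than fixed at 0.5; the toy run merely shows that shrinking the covariance makes Fisher's rule well defined when p exceeds n, whereas the raw sample covariance would be singular.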

Keywords

Linear discriminant analysis

Optimal shrinkage estimation

Spiked model

Random matrix theory 

Main Sponsor

Section on Statistical Learning and Data Science