56: Optimal Shrinkage Estimation for Linear Discriminant Analysis in Ultra-high Dimension

Wonjun Seo (First Author, Presenting Author)
Xiucai Ding (Co-Author)

Tuesday, Aug 5: 10:30 AM - 12:20 PM
1588 
Contributed Posters 
Music City Center 
Linear discriminant analysis (LDA) faces significant challenges when the number of features (p) exceeds the number of observations (n). While various methods have been proposed to address this issue, most assume n and p are comparable or impose restrictive structural assumptions on the population covariance matrix. In this study, we present a unified framework for LDA based on an optimal shrinkage method designed for ultra-high dimensional data, where p grows polynomially in n. As examples within our framework, we consider two types of shrinkage estimators: a linear shrinker, leading to a regularized LDA, and a nonlinear shrinker under the generalized spiked covariance matrix model. Leveraging recent advances in random matrix theory, we establish theoretical guarantees for our approach by analyzing the asymptotic behavior of outlier eigenvalues and eigenvectors, as well as deriving a quantum unique ergodicity estimate for non-outlier eigenvectors of the spiked sample covariance matrix. These results also reveal a phase transition phenomenon in LDA, allowing us to characterize the conditions under which LDA succeeds or fails based on the magnitude of the mean difference.
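
For intuition only, here is a minimal sketch of the linear-shrinker case described above: the pooled sample covariance is shrunk toward a scaled identity (so it remains invertible when p > n) and then plugged into Fisher's discriminant rule, which is the sense in which a linear shrinker yields a regularized LDA. The fixed intensity lam, the helper names, and the toy spiked-mean data are illustrative assumptions, not the optimal shrinkage estimator analyzed in the poster.

import numpy as np

def linear_shrinkage_cov(X, lam):
    """Shrink the sample covariance toward a scaled identity:
    Sigma_hat = (1 - lam) * S + lam * mu * I, with mu = tr(S) / p."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    S = Xc.T @ Xc / n
    mu = np.trace(S) / p
    return (1.0 - lam) * S + lam * mu * np.eye(p)

def shrinkage_lda_fit(X0, X1, lam=0.5):
    """Two-class LDA with a pooled, linearly shrunk covariance.
    Returns the discriminant direction w and threshold c; a new
    point x is assigned to class 1 when w @ x > c."""
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    X_pooled = np.vstack([X0 - mu0, X1 - mu1])
    Sigma = linear_shrinkage_cov(X_pooled, lam)  # invertible even when p > n
    w = np.linalg.solve(Sigma, mu1 - mu0)        # Fisher direction
    c = w @ (mu0 + mu1) / 2.0                    # midpoint threshold
    return w, c

# Toy example with p > n and a spiked mean difference (hypothetical data).
rng = np.random.default_rng(0)
n, p = 50, 200
delta = np.zeros(p)
delta[:5] = 2.0                                  # mean shift on 5 coordinates
X0 = rng.standard_normal((n, p))
X1 = rng.standard_normal((n, p)) + delta
w, c = shrinkage_lda_fit(X0, X1)
X_test = rng.standard_normal((n, p)) + delta     # fresh draws from class 1
print("test accuracy on class 1:", np.mean(X_test @ w > c))

In the poster's framework the shrinkage would be chosen optimally rather than fixed at 0.5; the toy run merely shows that shrinking the covariance makes Fisher's rule well defined when p exceeds n, whereas the raw sample covariance would be singular.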

Keywords

Linear discriminant analysis

Optimal shrinkage estimation

Spiked model

Random matrix theory 

Main Sponsor

Section on Statistical Learning and Data Science