Penalized Principal Component Analysis Using Nesterov Smoothing

Presented During: Big Data & The Curse of Dimensionality

Georg Hahn Co-Author

Rebecca Hurwitz First Author
Harvard University

Rebecca Hurwitz Presenting Author
Harvard University

Tuesday, Aug 5: 9:35 AM - 9:50 AM
1691
Contributed Papers

Music City Center

Principal components computed via PCA are traditionally used to reduce the dimensionality of genomic data or to correct for population stratification. We study the penalized eigenvalue problem (PEP), which reformulates the computation of the first eigenvector as an optimization problem and adds an L1 penalty to enforce sparsity. Our contribution is threefold. First, we extend PEP by applying Nesterov smoothing to the LASSO-type L1 penalty, which yields a differentiable surrogate whose gradient can be computed analytically, enabling faster, more efficient minimization of the objective function. Second, we illustrate how higher-order eigenvectors can be computed with PEP using established SVD results. Third, we present experimental studies demonstrating the utility of smoothed penalized eigenvectors compared to other state-of-the-art methods. Using 1000 Genomes Project data, we show empirically that our smoothed PEP improves numerical stability and yields meaningful eigenvectors. We further apply the PEP approach to real data (polygenic risk score computation and clustering), demonstrating that replacing the penalized eigenvectors with their smoothed counterparts enhances prediction accuracy and cluster discernibility.
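
As a rough illustration of the first contribution, the sketch below smooths the L1 penalty and maximizes v'Av minus the smoothed penalty over unit vectors v (equivalently, minimizes the negated objective). It is not the authors' implementation. It assumes a symmetric matrix A, a quadratic prox function (under which the Nesterov-smoothed absolute value is the Huber function), and a plain projected gradient iteration; the function names, the step size rule, and the parameters lam and mu are all hypothetical choices made for this example.

```python
import numpy as np


def smoothed_abs(x, mu):
    """Huber-type surrogate of |x|: Nesterov smoothing with a quadratic prox (assumed)."""
    x = np.asarray(x, dtype=float)
    return np.where(np.abs(x) <= mu, x**2 / (2.0 * mu), np.abs(x) - mu / 2.0)


def smoothed_abs_grad(x, mu):
    """Analytical derivative of smoothed_abs: x/mu clipped to [-1, 1]."""
    x = np.asarray(x, dtype=float)
    return np.clip(x / mu, -1.0, 1.0)


def objective(v, A, lam, mu):
    """Smoothed penalized objective v'Av - lam * sum_i smoothed_abs(v_i)."""
    return v @ A @ v - lam * smoothed_abs(v, mu).sum()


def gradient(v, A, lam, mu):
    """Gradient of the objective; the 2*A@v term assumes A is symmetric."""
    return 2.0 * A @ v - lam * smoothed_abs_grad(v, mu)


def smoothed_pep(A, lam, mu=0.1, step=None, n_iter=500, seed=0):
    """Projected gradient ascent on the unit sphere (illustrative solver only)."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(A.shape[0])
    v /= np.linalg.norm(v)
    if step is None:
        # Heuristic step: inverse of a Lipschitz bound on the gradient (assumption).
        step = 1.0 / (2.0 * np.linalg.norm(A, 2) + lam / mu)
    for _ in range(n_iter):
        v = v + step * gradient(v, A, lam, mu)  # ascent step
        v /= np.linalg.norm(v)                  # project back onto the unit sphere
    return v
```

With lam = 0 the iteration reduces to a shifted power method and should recover the ordinary leading eigenvector; larger lam pushes small entries toward zero, while mu trades closeness to the exact L1 penalty (the surrogate differs from |x| by at most mu/2) against smoothness of the gradient.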

Keywords

Principal Component Analysis

Eigenvector

Smoothing

Genomic Relationship Matrix

Singular Value Decomposition

Nesterov

Main Sponsor

Section on Statistical Computing