Penalized Principal Component Analysis Using Nesterov Smoothing

Georg Hahn Co-Author
 
Rebecca Hurwitz First Author
Harvard University
 
Rebecca Hurwitz Presenting Author
Harvard University
 
Tuesday, Aug 5: 9:35 AM - 9:50 AM
1691 
Contributed Papers 
Music City Center 
Principal components computed via PCA are traditionally used to reduce dimensionality in genomic data or to correct for population stratification. We explore the penalized eigenvalue problem (PEP), which reformulates the computation of the first eigenvector as an optimization problem and adds an L1 penalty to enforce sparsity. Our contribution is threefold. First, we extend PEP by applying Nesterov smoothing to the LASSO-type L1 penalty, which allows the gradient to be computed analytically and the objective function to be minimized more efficiently. Second, we illustrate how higher-order eigenvectors can be computed with PEP using established SVD results. Third, we present experimental studies demonstrating the utility of smoothed penalized eigenvectors compared to other state-of-the-art methods. Using 1000 Genomes Project data, we show empirically that our smoothed PEP improves numerical stability and yields meaningful eigenvectors. We also apply the PEP approach to further real-data applications (polygenic risk score computation and clustering), demonstrating that replacing the penalized eigenvectors with their smoothed counterparts enhances prediction accuracy and cluster discernibility.
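
As a rough illustration of the first contribution, the sketch below shows one common way to smooth an L1 penalty so that the gradient has a closed form. It is not taken from the paper: it assumes the quadratic-prox variant of Nesterov smoothing (which turns |x| into a Huber-type function), a symmetric input matrix A, and an objective written as -v'Av + lam * ||v||_1 to be minimized. The function names and the parameters lam and mu are hypothetical, and the unit-norm constraint on v is not handled here.

```python
import numpy as np

def smoothed_abs(x, mu):
    """Smooth surrogate of |x|: quadratic on [-mu, mu], linear outside.

    Assumption: Nesterov smoothing with a quadratic prox term, which gives
    max over |w| <= 1 of (w*x - mu/2 * w^2), i.e. a Huber-type function.
    """
    ax = np.abs(x)
    return np.where(ax <= mu, x**2 / (2.0 * mu), ax - mu / 2.0)

def smoothed_abs_grad(x, mu):
    """Closed-form derivative of smoothed_abs with respect to x."""
    return np.clip(x / mu, -1.0, 1.0)

def smoothed_pep_objective(v, A, lam, mu):
    """Hypothetical smoothed objective: -v'Av + lam * sum_i smoothed_abs(v_i)."""
    return -(v @ A @ v) + lam * np.sum(smoothed_abs(v, mu))

def smoothed_pep_gradient(v, A, lam, mu):
    """Analytical gradient of the objective above (A assumed symmetric)."""
    return -2.0 * (A @ v) + lam * smoothed_abs_grad(v, mu)

def numerical_gradient(f, v, h=1e-6):
    """Central finite differences, used only to sanity-check the formula."""
    grad = np.zeros_like(v)
    for i in range(v.size):
        e = np.zeros_like(v)
        e[i] = h
        grad[i] = (f(v + e) - f(v - e)) / (2.0 * h)
    return grad
```

The smoothing parameter mu controls the trade-off in this sketch: the surrogate differs from |x| by at most mu/2, so a smaller mu tracks the original penalty more closely at the cost of a gradient that changes more abruptly near zero. A gradient-based optimizer would still need to keep v on the unit sphere, for example by renormalizing after each step.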

Keywords

Principal Component Analysis

Eigenvector

Smoothing

Genomic Relationship Matrix

Singular Value Decomposition

Nesterov 

Main Sponsor

Section on Statistical Computing