46: Extending Sparse CCA for Multi-Population, Multi-Feature Integration
Quefeng Li
Co-Author
University of North Carolina Chapel Hill
Tuesday, Aug 5: 2:00 PM - 3:50 PM
1922
Contributed Posters
Music City Center
Sparse canonical correlation analysis (SCCA) identifies sparse linear combinations of two sets of features that are highly correlated with each other. While several SCCA methods extend this framework to more than two datasets, they assume that the different feature sets are measured on the same population. Here, we propose an extension of SCCA designed for settings with four data matrices derived from two distinct populations, each with two different feature sets. The correlation maximization problem is reframed as a minimization problem, and the original canonical weights are decomposed into two separate components that capture the shared and unique variance for each dataset. Via simulations, we demonstrate that our method recovers the true canonical weights more accurately than naïve methods that disregard either the shared or the unique components. In a real-data analysis, we apply our method to integrate two single-cell multiomic datasets of peripheral blood mononuclear cells with simultaneous measurements of RNA expression and chromatin accessibility, benchmarking its performance against widely used single-cell integration pipelines such as Seurat and Signac.
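For readers unfamiliar with the baseline, the following is a minimal sketch of ordinary two-matrix SCCA, not the four-matrix method proposed in this abstract. It assumes a common simplification in which the within-set covariance is treated as the identity, and it uses fixed soft-thresholding levels in place of tuned L1 constraints. The function names (`sparse_cca_rank1`, `soft_threshold`, `unit_norm`, `standardize`), the penalty values, and the simulated data are all hypothetical and chosen only for illustration.

```python
import numpy as np


def soft_threshold(a, lam):
    """Elementwise soft-thresholding: shrink each entry toward zero by lam."""
    return np.sign(a) * np.maximum(np.abs(a) - lam, 0.0)


def unit_norm(w):
    """Scale w to unit L2 norm; an all-zero vector is returned unchanged."""
    norm = np.linalg.norm(w)
    return w / norm if norm > 0 else w


def standardize(M):
    """Center columns and scale them to unit variance (assumes no constant columns)."""
    return (M - M.mean(axis=0)) / M.std(axis=0, ddof=1)


def sparse_cca_rank1(X, Y, lam_u=0.3, lam_v=0.3, n_iter=100, tol=1e-8):
    """First pair of sparse canonical weights via alternating soft-thresholding.

    Illustrative assumption: within-set covariances are treated as identity,
    so only the cross-correlation matrix of X and Y enters the updates.
    """
    X, Y = standardize(X), standardize(Y)
    C = X.T @ Y / (X.shape[0] - 1)  # sample cross-correlation matrix (p x q)
    # Initialize v with the leading right singular vector of C.
    v = np.linalg.svd(C, full_matrices=False)[2][0]
    u = np.zeros(C.shape[0])
    for _ in range(n_iter):
        u_new = unit_norm(soft_threshold(C @ v, lam_u))
        v_new = unit_norm(soft_threshold(C.T @ u_new, lam_v))
        change = np.linalg.norm(u_new - u) + np.linalg.norm(v_new - v)
        u, v = u_new, v_new
        if change < tol:
            break
    return u, v


# Simulated example: one latent factor z drives the first 3 columns of X
# and the first 2 columns of Y; every other column is pure noise.
rng = np.random.default_rng(0)
n = 500
z = rng.standard_normal(n)
X = rng.standard_normal((n, 20))
Y = rng.standard_normal((n, 15))
X[:, :3] = z[:, None] + 0.5 * X[:, :3]
Y[:, :2] = z[:, None] + 0.5 * Y[:, :2]

u, v = sparse_cca_rank1(X, Y)
rho = np.corrcoef(standardize(X) @ u, standardize(Y) @ v)[0, 1]
print("nonzero X weights:", np.flatnonzero(u))
print("nonzero Y weights:", np.flatnonzero(v))
print("canonical correlation:", round(float(rho), 3))
```

One standard way to recast correlation maximization as minimization, which may or may not be the reformulation used here, relies on the identity ||Xu - Yv||^2 = ||Xu||^2 + ||Yv||^2 - 2 u'X'Yv: when both canonical variates are constrained to unit norm, maximizing u'X'Yv is equivalent to minimizing ||Xu - Yv||^2.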
Sparse Canonical Correlation Analysis
Data Integration
Variance Decomposition
Single-Cell Multiomics
Main Sponsor
Section on Statistics in Genomics and Genetics