Harmonizing Heterogeneous Single-cell Gene Expression Data with Individual-level Covariates
Vivian Li
Co-Author
University of California, Riverside
Tuesday, Aug 5: 9:20 AM - 9:35 AM
2583
Contributed Papers
Music City Center
The growing availability of single-cell RNA sequencing (scRNA-seq) data highlights
the necessity for robust integration methods to uncover both shared and unique cellular
features across samples. These datasets often exhibit technical variations and biological
differences, complicating integrative analyses. While numerous integration methods have been
proposed, many fail to account for individual-level covariates or are limited to discrete
variables. To address these limitations, we propose scINSIGHT2, a generalized linear latent
variable model that accommodates both continuous covariates, such as age, and discrete
factors, such as disease conditions. Through both simulation studies and real-data applications,
we demonstrate that scINSIGHT2 accurately harmonizes scRNA-seq datasets, whether from
single or multiple sources. These results highlight scINSIGHT2's utility in capturing meaningful
biological insights from scRNA-seq data while accounting for individual-level variation.
single-cell RNA-seq
integration
generalized linear latent variable model
Main Sponsor
Section on Statistics in Genomics and Genetics