Generalized Estimating Equation for Modeling Cell-Cell Correlation in Single-Cell RNA Seq Data
Tuo Lin
Co-Author
University of Florida
Toni Gui
Co-Author
University of Florida
Xin Tu
Co-Author
University of California San Diego
Monday, Aug 4: 11:15 AM - 11:35 AM
Invited Paper Session
Music City Center
In the analysis of single-cell RNA sequencing (scRNA-seq) data, cells from the same individual are believed to share common genetic and environmental backgrounds and are therefore not statistically independent. Many popular methods, such as the default Wilcoxon rank-sum test (wilcox) in the FindMarkers function of the Seurat package, do not address this dependence, potentially leading to highly inflated type I error rates. More recent work argues for the use of generalized linear mixed models with a random effect for individual to properly account for the correlation among measurements from cells within an individual. However, the traditional mixed-effects model makes the strong assumption that all pairs of cells within an individual share the same, strictly positive correlation. We demonstrate that this assumption can be rather restrictive for real data, given the heterogeneity of cells within the same subject. When the positive-correlation assumption is violated, the classical random-effects model yields consistently biased inference and inflated type I error rates in the differential expression analyses we investigated. We propose a semiparametric approach based on generalized estimating equations (GEE) to address this issue and demonstrate its robust and efficient performance in both simulations and a real-data application that focuses on revealing common and unique gene expression signatures in primary CD4+ T cells latently infected with HIV under different conditions.
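As a rough illustration of the contrast drawn above, the sketch below writes out a linear random-intercept model and the standard GEE estimating equation with its sandwich variance. The notation (y_ij for the expression of a gene in cell j of individual i, x_ij for covariates, V_i for the working covariance) is assumed here for illustration and is not taken from the abstract; the linear model is only the simplest case, and the authors' actual specification may differ.

```latex
% Illustrative linear random-intercept model (assumed notation):
% cell j = 1, ..., n_i within individual i = 1, ..., m
y_{ij} = x_{ij}^{\top}\beta + b_i + \varepsilon_{ij},
\qquad b_i \sim N(0, \sigma_b^2), \quad \varepsilon_{ij} \sim N(0, \sigma^2),
\quad b_i \text{ independent of } \varepsilon_{ij}

% Implied correlation between any two cells j \neq k of the same individual:
% the same value for every pair, and strictly positive whenever \sigma_b^2 > 0
\operatorname{Corr}(y_{ij}, y_{ik}) = \frac{\sigma_b^2}{\sigma_b^2 + \sigma^2} \ge 0

% GEE: only the marginal mean \mu_i(\beta) = E(y_i \mid x_i) is modeled;
% V_i is a working covariance that need not be correctly specified
\sum_{i=1}^{m} D_i^{\top} V_i^{-1} \bigl\{ y_i - \mu_i(\beta) \bigr\} = 0,
\qquad D_i = \frac{\partial \mu_i(\beta)}{\partial \beta^{\top}}

% Sandwich (robust) variance estimator, evaluated at \hat{\beta}
\widehat{\operatorname{Var}}(\hat{\beta}) = B^{-1} M B^{-1},
\quad B = \sum_{i=1}^{m} D_i^{\top} V_i^{-1} D_i,
\quad M = \sum_{i=1}^{m} D_i^{\top} V_i^{-1} (y_i - \hat{\mu}_i)(y_i - \hat{\mu}_i)^{\top} V_i^{-1} D_i
```

In this sketch the random-intercept model forces one shared, non-negative correlation on all cell pairs within an individual. The GEE places no such restriction on the true correlation, because the sandwich estimator remains valid when V_i is misspecified.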
HIV latency
single cell RNA seq