mcRigor: a statistical method to enhance the rigor of metacell partitioning in single-cell data analysis

Jingyi Jessica Li Co-Author
UCLA
 
Pan Liu First Author
 
Pan Liu Presenting Author
 
Tuesday, Aug 5: 8:50 AM - 9:05 AM
1127 
Contributed Papers 
Music City Center 
In single-cell data analysis, sparsity is often addressed by aggregating the profiles of homogeneous single cells into metacells. However, existing metacell partitioning methods do not check the homogeneity assumption and may aggregate heterogeneous single cells, potentially biasing downstream analysis and leading to spurious discoveries. To fill this gap, we introduce mcRigor, a statistical method that detects dubious metacells (those composed of heterogeneous single cells) and optimizes the hyperparameter of a metacell partitioning method. The core of mcRigor is a feature-correlation-based statistic that measures the heterogeneity of a metacell, with its null distribution derived from a double permutation scheme. As an optimizer for existing metacell partitioning methods, mcRigor has been shown to improve the reliability of discoveries in single-cell RNA-seq and multiome (RNA+ATAC) data analyses, such as uncovering differential gene co-expression modules, enhancer-gene associations, and temporal gene expression. Moreover, mcRigor enables benchmarking and selection of the most suitable metacell partitioning method, with hyperparameters optimized for the specific dataset.
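
To make the idea of a feature-correlation-based heterogeneity check concrete, the following is a minimal Python sketch and not the mcRigor implementation. The abstract does not specify the statistic or the double permutation scheme, so everything here is an assumption for illustration: the statistic is taken to be the mean absolute pairwise feature correlation within one metacell, and the null is built with a simple single-level permutation (each feature shuffled independently across the metacell's cells). The function names `heterogeneity_stat` and `permutation_pvalue` are hypothetical.

```python
import numpy as np


def heterogeneity_stat(X):
    """Mean absolute off-diagonal feature-feature correlation.

    X is a (cells x features) matrix for ONE metacell. This statistic is an
    illustrative assumption, not the statistic defined by mcRigor.
    """
    X = np.asarray(X, dtype=float)
    X = X[:, X.std(axis=0) > 0]          # drop constant features (undefined correlation)
    if X.shape[0] < 3 or X.shape[1] < 2:
        return 0.0                       # too little data to measure correlation
    corr = np.corrcoef(X, rowvar=False)  # features are columns
    upper = np.triu_indices_from(corr, k=1)
    return float(np.mean(np.abs(corr[upper])))


def permutation_pvalue(X, n_perm=200, seed=0):
    """Permutation p-value for the statistic above.

    Null: each feature is shuffled independently across cells, which removes
    feature-feature correlation while keeping each feature's marginal values.
    This is a plain single permutation, a stand-in for (not a reproduction of)
    the double permutation scheme mentioned in the abstract.
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    observed = heterogeneity_stat(X)
    null = np.empty(n_perm)
    for b in range(n_perm):
        shuffled = np.column_stack(
            [rng.permutation(X[:, j]) for j in range(X.shape[1])]
        )
        null[b] = heterogeneity_stat(shuffled)
    p_value = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return observed, float(p_value)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Simulated homogeneous metacell: 30 cells sharing one expression profile.
    homogeneous = rng.poisson(5.0, size=(30, 20))
    # Simulated heterogeneous metacell: two cell groups with opposite profiles.
    group_a = rng.poisson(np.r_[np.full(10, 10.0), np.full(10, 1.0)], size=(15, 20))
    group_b = rng.poisson(np.r_[np.full(10, 1.0), np.full(10, 10.0)], size=(15, 20))
    heterogeneous = np.vstack([group_a, group_b])

    print("homogeneous  :", permutation_pvalue(homogeneous))
    print("heterogeneous:", permutation_pvalue(heterogeneous))
```

In this toy setting, mixing two cell groups induces strong correlations among features, so the heterogeneous metacell's statistic falls far in the tail of its permutation null, whereas the homogeneous metacell's statistic looks like a typical null draw. Under these assumptions, a metacell with a small p-value would be flagged as dubious.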

Keywords

Metacell partitioning

Single-cell RNA-seq

Single-cell ATAC-seq

Data sparsity

Permutation 

Main Sponsor

Section on Statistics in Genomics and Genetics