Spatial Transcriptomics and Deconvolution Methods

Kai Wang Chair
University of Iowa
 
Tuesday, Aug 5: 2:00 PM - 3:50 PM
4137 
Contributed Papers 
Music City Center 
Room: CC-207B 

Main Sponsor

Section on Statistics in Genomics and Genetics

Presentations

A unified statistical model to detect cell-type-specific spatially variable genes

One of the major challenges in spatial transcriptomics is detecting spatially variable genes (SVGs), whose expression patterns are non-random across tissue locations. Many SVGs correlate with cell type composition, which motivates the concept of cell-type-specific SVGs (ctSVGs). Existing ctSVG detection methods treat cell-type-specific spatial effects as fixed effects, so their results depend on the rotation of the tissue coordinates. Moreover, SVGs may exhibit random spatial patterns within cell types, meaning an SVG is not always a ctSVG, and vice versa, further complicating detection. We propose STANCE, a unified statistical model for detecting both SVGs and ctSVGs under a linear mixed-effect model framework that integrates gene expression, spatial location, and cell type composition information. STANCE ensures tissue rotation-invariant results and uses a two-stage approach: initial SVG/ctSVG detection followed by ctSVG-specific testing. We demonstrate its performance through extensive simulations and analyses of public datasets. Downstream analyses reveal STANCE's potential in spatial transcriptomics analysis.
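As a minimal illustration of what rotation invariance means here, the sketch below builds a spatial covariance kernel from pairwise distances and checks that it is unchanged when the spot coordinates are rotated. The abstract does not say which kernel STANCE uses; the Gaussian kernel, the `bandwidth` parameter, and the function names are assumptions made for illustration only.

```python
import math

def gaussian_kernel(coords, bandwidth=1.0):
    """Kernel matrix K[i][j] = exp(-||s_i - s_j||^2 / (2 * bandwidth^2)).

    Assumed kernel for illustration; it depends on coordinates only
    through pairwise Euclidean distances.
    """
    n = len(coords)
    K = [[0.0] * n for _ in range(n)]
    for i, (xi, yi) in enumerate(coords):
        for j, (xj, yj) in enumerate(coords):
            d2 = (xi - xj) ** 2 + (yi - yj) ** 2
            K[i][j] = math.exp(-d2 / (2.0 * bandwidth ** 2))
    return K

def rotate(coords, angle):
    """Rotate 2-D coordinates counter-clockwise by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y) for x, y in coords]

# Hypothetical spot locations on a tissue slide.
spots = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 1.0)]
K_original = gaussian_kernel(spots)
K_rotated = gaussian_kernel(rotate(spots, math.radians(37)))

# Largest entry-wise difference between the two kernel matrices.
max_diff = max(
    abs(a - b)
    for row_a, row_b in zip(K_original, K_rotated)
    for a, b in zip(row_a, row_b)
)
```

Because rotation preserves distances, `max_diff` is zero up to floating-point error. A fixed effect that is linear in the raw x or y coordinate would not have this property, since the coordinates themselves change under rotation.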

Keywords

spatially variable genes

cell-type-specific spatially variable genes

spatial transcriptomics

spatial domain detection 

Co-Author(s)

Bin Chen, Michigan State University
Yuehua Cui, Michigan State University
Yuesong Wu, Michigan State University

First Author

Haohao Su, Michigan State University

Presenting Author

Haohao Su, Michigan State University

Spatial GEE for identifying differentially expressed genes in spatial transcriptomics

Spatial transcriptomics (ST) provides unprecedented insights into gene expression patterns while retaining spatial context, making it valuable for understanding complex tissue architectures like cancers. Seurat, the most popular ST analysis tool, uses the Wilcoxon rank-sum test by default for differential expression (DE) analysis. However, as a nonparametric method that disregards spatial correlations, the Wilcoxon test can lead to inflated false positive rates and misleading findings, highlighting the need for a more robust statistical approach.

We propose a Generalized Score Test (GST) in the Generalized Estimating Equations (GEE) framework as a robust solution for DE analysis in ST. Because it appropriately accounts for spatial correlations, the GEE GST showed superior Type I error control and comparable power in extensive simulations, relative to the Wilcoxon test and the GEE robust Wald test. Applications to ST datasets from breast and prostate cancer revealed that the GST-identified DE genes were predominantly enriched in pathways directly implicated in cancer progression, while the Wilcoxon test produced substantial false positives.
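To see why ignoring correlation inflates false positives, consider a deliberately stylized case: a z-test on the mean of m observations that share a common pairwise correlation rho, carried out as if the observations were independent. This is neither the Wilcoxon test nor the GEE score test from the abstract, and real spatial correlation usually decays with distance rather than being exchangeable; the function names below are hypothetical.

```python
from statistics import NormalDist

def design_effect(m, rho):
    """Variance inflation of a mean of m equicorrelated observations.

    Var(mean) = (sigma^2 / m) * (1 + (m - 1) * rho).
    """
    return 1.0 + (m - 1) * rho

def actual_type1_error(m, rho, alpha=0.05):
    """True rejection rate of a two-sided z-test that assumes independence.

    The naive statistic has standard deviation sqrt(design_effect) under
    the null, so it exceeds the nominal critical value too often.
    """
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    inflation = design_effect(m, rho) ** 0.5
    return 2.0 * (1.0 - NormalDist().cdf(z_crit / inflation))

# With no correlation the test holds its nominal level ...
level_independent = actual_type1_error(m=10, rho=0.0)
# ... while moderate positive correlation inflates it well above 5%.
level_correlated = actual_type1_error(m=10, rho=0.3)
```

Under these assumptions, ten observations with rho = 0.3 have a design effect of 3.7, and the nominal 5% test rejects a true null roughly 30% of the time.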

Keywords

Differential expression

GEE

Generalized score test

Spatial transcriptomics

Wilcoxon rank-sum test

Type I error 

Co-Author(s)

Chenxuan Zang, Department of Biostatistics, The University of Texas MD Anderson Cancer Center
Ziyi Li, MD Anderson Cancer Center
Charles Guo, Department of Pathology, The University of Texas MD Anderson Cancer Center
Dejian Lai, University of Texas, Health Science Center At Houston
Peng Wei, University of Texas, MD Anderson Cancer Center

First Author

Yishan Wang

Presenting Author

Yishan Wang

A novel spatially informed reference-free deconvolution method for spatial transcriptomics

Cell-type deconvolution methods have been a driving force behind the rapid development of spatial transcriptomics (ST) technologies in the past few years. Though reference-based deconvolution methods have been extensively studied, there is still a large demand for methodology development in reference-free deconvolution. STdeconvolve is one of the earliest reference-free deconvolution methods. However, it does not take spatial information into account, limiting its practical utility. Here we introduce a reference-free approach called SpatialDC for spatially informed cell-type deconvolution for ST. In our model, we encourage spatially close spots to share similar cell types, leading to improved spatial deconvolution results. We evaluate our model on both simulated and real datasets generated from various ST technologies, including a manually annotated dataset (MOB), 10X Visium, and DBiT-seq. The SpatialDC framework demonstrates robust performance in recovering accurate cell-type proportions and transcriptional profiles while effectively accounting for spatial correlations between pixels. This work presents statistical and computational advancements for analyzing complex spatial gene expression data.
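One common way to encourage neighboring spots to share similar cell types is a smoothness penalty on the estimated proportions. The abstract does not describe how SpatialDC encodes this, so the squared-difference penalty, the neighbor list, and the function name below are assumptions for illustration only.

```python
def spatial_penalty(proportions, neighbors):
    """Sum of squared differences in cell-type proportions over neighbor pairs.

    proportions: list of per-spot proportion vectors (each sums to 1).
    neighbors:   list of (i, j) index pairs of spatially adjacent spots.
    """
    total = 0.0
    for i, j in neighbors:
        total += sum((a - b) ** 2 for a, b in zip(proportions[i], proportions[j]))
    return total

# Three hypothetical spots on a line: 0 - 1 - 2.
neighbors = [(0, 1), (1, 2)]

# Proportions that change gradually across the tissue ...
smooth = [[0.8, 0.2], [0.7, 0.3], [0.6, 0.4]]
# ... versus proportions where the middle spot disagrees with both neighbors.
patchy = [[0.8, 0.2], [0.1, 0.9], [0.6, 0.4]]

penalty_smooth = spatial_penalty(smooth, neighbors)
penalty_patchy = spatial_penalty(patchy, neighbors)
```

Adding a term like this to a deconvolution objective favors the smooth solution over the patchy one when both fit the expression data equally well.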

Keywords

Spatial transcriptomics

Deconvolution

Reference-free

Latent Dirichlet Allocation (LDA) 

Co-Author

Yuehua Cui, Michigan State University

First Author

Phuong Vo, Michigan State University

Presenting Author

Phuong Vo, Michigan State University

Comparison of methods for cell type deconvolution in spatially resolved transcriptomic data

Recent technological advancements have made it possible to perform spatially resolved transcriptomic (SRT) profiling, which enhances our understanding of cell-cell communication within the context of tissues. However, current techniques require a compromise between experimental throughput and spatial resolution. Sequencing-based technologies prioritize higher experimental throughput, resulting in multicellular pixel data. These datasets necessitate innovative computational methods to deconvolute cell types and avoid potential confounding issues within each pixel. Topic modeling methods, such as Latent Dirichlet Allocation (LDA), spatial LDA, and other statistical frameworks, provide a way to identify cell type composition from multicellular pixels. In this study, we evaluate several deconvolution approaches, assessing their effectiveness in capturing cell type distribution per pixel and gene expression distribution per cell type. Our analysis highlights the strengths and limitations of existing methods, offering guidance on best practices for analyzing multicellular pixel SRT data.
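Comparing estimated and true per-pixel proportions has one practical wrinkle for topic models: the inferred topics are unlabeled, so they must be matched to the true cell types before computing an error. The sketch below shows one plausible evaluation, root-mean-square error after the best column permutation. The abstract does not state which metrics the study uses, and the function names and toy data are hypothetical.

```python
from itertools import permutations
import math

def rmse(est, truth):
    """Root-mean-square error over all pixel-by-cell-type entries."""
    squared = [
        (e - t) ** 2
        for row_e, row_t in zip(est, truth)
        for e, t in zip(row_e, row_t)
    ]
    return math.sqrt(sum(squared) / len(squared))

def best_matched_rmse(est, truth):
    """Try every topic-to-cell-type assignment and keep the lowest RMSE.

    Brute force over permutations, so only suitable for a handful of
    cell types.
    """
    k = len(truth[0])
    best_score, best_perm = None, None
    for perm in permutations(range(k)):
        permuted = [[row[p] for p in perm] for row in est]
        score = rmse(permuted, truth)
        if best_score is None or score < best_score:
            best_score, best_perm = score, perm
    return best_score, best_perm

# True proportions for three pixels and two cell types.
truth = [[0.7, 0.3], [0.2, 0.8], [0.5, 0.5]]
# Estimates that are close to the truth but with the topic columns swapped.
est = [[0.25, 0.75], [0.8, 0.2], [0.5, 0.5]]

score, perm = best_matched_rmse(est, truth)
```

Here the best assignment swaps the two columns. For larger numbers of cell types, an assignment solver such as `scipy.optimize.linear_sum_assignment` would be a more scalable way to do the matching.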

Keywords

Spatially resolved transcriptomic data

Multicellular pixel data

Cell type deconvolution

Topic model

Latent Dirichlet Allocation

Cell-cell communication 

Co-Author

Yuan Wang, Washington State University

First Author

Wooyoung Kim, Washington State University

Presenting Author

Wooyoung Kim, Washington State University

Partitioning the Full Transcriptome Profile Within and Beyond Cells in Spatial Transcriptomics

Single-cell RNA sequencing (scRNA-seq) has advanced our understanding of biological systems, yet it fails to capture crucial components of the tissue transcriptome, such as neurite-localized transcripts and extracellular RNA. Spatial transcriptomics (ST) technologies offer an alternative by capturing transcript locations without tissue dissociation. However, existing approaches—such as cell type deconvolution and cell segmentation—primarily aim to recover single-cell-level information, overlooking the residual transcriptome: mRNAs that are either not captured by scRNA-seq or not assigned to any segmented cells in ST data. To address these limitations, we introduce RESCUE, a novel statistical framework that fully partitions gene expression data into contributions from known reference factors and the residual transcriptome. We formulate the problem as a penalized robust regression with a sparse mean-shift parameterization. To account for gene-specific variability, we employ iteratively reweighted adaptive Lasso-type weights. An efficient simulation-based surrogate matching pursuit algorithm is developed for the tuning procedure. Our results demonstrate that RESCUE outperforms existing methods in accurately decomposing ST data and recovers biologically meaningful signals that were previously overlooked. By fully leveraging the unbiased nature of ST data, RESCUE provides a more comprehensive view of transcriptomic organization both within and beyond cell bodies. 
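The idea of a sparse mean-shift parameterization can be shown on a toy problem. Each observation gets its own shift term, and an L1 penalty keeps most shifts at exactly zero, so only observations that the model cannot explain receive a nonzero shift. The sketch below uses a scalar location model with a single penalty level; RESCUE itself is described as a multivariate regression on reference factors with adaptive weights and a dedicated tuning algorithm, none of which is reproduced here. All names are hypothetical.

```python
def soft_threshold(value, lam):
    """Shrink `value` toward zero by `lam`, returning 0.0 inside [-lam, lam]."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0

def mean_shift_location(y, lam, n_iter=100):
    """Minimize 0.5 * sum((y_i - mu - gamma_i)^2) + lam * sum(|gamma_i|).

    Alternates two exact updates: mu is the mean of the shift-corrected
    data, and each gamma_i is the soft-thresholded residual.
    """
    gamma = [0.0] * len(y)
    mu = 0.0
    for _ in range(n_iter):
        mu = sum(yi - gi for yi, gi in zip(y, gamma)) / len(y)
        gamma = [soft_threshold(yi - mu, lam) for yi in y]
    return mu, gamma

# Five well-behaved observations near 1.0 and one that the model cannot explain.
y = [1.0, 1.2, 0.8, 1.1, 0.9, 10.0]
mu, gamma = mean_shift_location(y, lam=2.0)
plain_mean = sum(y) / len(y)
```

In this toy example the plain mean is pulled up to 2.5 by the last observation, while the mean-shift estimate settles near 1.4 and assigns a nonzero shift only to that observation. The nonzero shifts play the role of a residual component that the main model leaves unexplained.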

Keywords

Spatial transcriptomics

Single-cell RNA sequencing

Sparse recovery

Robust estimation

Regularized multivariate regression 

Co-Author(s)

Seokjin Yeo, University of Illinois at Urbana-Champaign
Alex Schrader, University of Illinois at Urbana-Champaign
Ian Traniello, Princeton University
Amy Cash Ahmed, University of Illinois at Urbana-Champaign
Gene Robinson, University of Illinois at Urbana-Champaign
Hee-Sun Han, University of Illinois at Urbana-Champaign
Sihai Dave Zhao, University of Illinois at Urbana-Champaign

First Author

Young Joo Lee, Department of Statistics, University of Illinois at Urbana-Champaign

Presenting Author

Young Joo Lee, Department of Statistics, University of Illinois at Urbana-Champaign

A unified framework for deconvolution-based clustering

We propose a novel statistical framework for simultaneously clustering and deconvoluting spatially resolved transcriptomic (SRT) data. Specifically, we introduce an estimation criterion that identifies clusters of spatial spots while also providing estimates of the cell-type composition of each cluster. Our approach formulates the clustering problem as a well-posed optimization, minimizing the proposed criterion, which incorporates spatial structure and cellular heterogeneity. This is solved efficiently using a block coordinate descent algorithm, where each subproblem is convex. To ensure robust and data-driven model selection, we introduce a new strategy for parameter tuning, alongside a novel post-clustering inference framework. This framework addresses challenges like inflated Type I error rates and enables valid hypothesis testing on the identified regions, providing a statistically rigorous basis for downstream analysis. Extensive simulation studies and real data applications demonstrate that our method significantly outperforms existing competitors, offering a scalable, interpretable, and reliable tool for analyzing complex SRT data.

Keywords

Clustering

Deconvolution

Spatial Transcriptomics

Post clustering inference

Optimization 

Co-Author

Aaron Molstad, University of Minnesota

First Author

Hyun Jung Koo, School of Statistics, University of Minnesota - Twin Cities

Presenting Author

Hyun Jung Koo, School of Statistics, University of Minnesota - Twin Cities