A Bayesian approach to model uncertainty in unsupervised learning from single-cell genomic data

Thomas E. Bartlett Co-Author
University College London
 
Lina Gerontogianni Co-Author
The Francis Crick Institute
 
Swati Chandna Co-Author
Birkbeck, University of London
 
Shanshan Ren First Author
University College London
 
Shanshan Ren Presenting Author
University College London
 
Thursday, Aug 7: 10:35 AM - 10:50 AM
1504 
Contributed Papers 
Music City Center 

Description

Network models provide a powerful framework for analysing single-cell count data, facilitating the characterisation of cellular identities, disease mechanisms, and developmental trajectories. However, uncertainty modelling in unsupervised learning from genomic data remains underexplored. Conventional clustering methods assign a single identity to each cell, potentially obscuring transitional states during differentiation or transformation. This study introduces a variational Bayesian framework for clustering and analysing single-cell genomic data, using a Bayesian Gaussian mixture model to estimate the probability that each cell belongs to each cluster. This approach captures cellular transitions, yielding biologically coherent insights into neurogenesis and breast cancer progression. The inferred cluster membership probabilities enable further analyses, including differential expression analysis, gene set enrichment analysis, and pseudotime analysis. Furthermore, we develop a novel quantitative measure for validating unsupervised learning on scRNA-seq data, which reflects the correspondence between clustering outcomes and marker genes more faithfully. This methodological advance enhances the resolution of single-cell data analysis, enabling a more nuanced characterisation of dynamic cellular identities in development and disease.
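
To illustrate the idea of probabilistic cluster membership, the following is a minimal sketch, not the authors' implementation. It assumes scikit-learn's `BayesianGaussianMixture` as the variational Bayesian Gaussian mixture, a synthetic two-dimensional embedding standing in for dimensionality-reduced single-cell data, and two components; the variable names, the simulated "transitional" cells, and the entropy-based uncertainty score are all hypothetical choices made for illustration.

```python
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(0)

# Synthetic stand-in for a low-dimensional embedding of cells (assumption:
# real data would first be normalised and reduced, e.g. by PCA).
state_a = rng.normal(loc=[-3.0, 0.0], scale=0.7, size=(150, 2))
state_b = rng.normal(loc=[3.0, 0.0], scale=0.7, size=(150, 2))
transitional = rng.normal(loc=[0.0, 0.0], scale=0.7, size=(20, 2))
embedding = np.vstack([state_a, state_b, transitional])

# Variational Bayesian Gaussian mixture; n_components=2 is an illustrative
# choice, not a value taken from the abstract.
model = BayesianGaussianMixture(
    n_components=2,
    covariance_type="full",
    max_iter=500,
    random_state=0,
)
model.fit(embedding)

# Soft assignment: one row per cell, one column per cluster, rows sum to 1.
membership = model.predict_proba(embedding)

# Per-cell Shannon entropy of the membership probabilities, used here as a
# simple (hypothetical) uncertainty score: higher means a less certain identity.
clipped = np.clip(membership, 1e-12, 1.0)
entropy = -np.sum(clipped * np.log(clipped), axis=1)

# Conventional hard labels, shown only for comparison with the soft assignment.
hard_labels = membership.argmax(axis=1)

print("membership shape:", membership.shape)
print("mean entropy, simulated transitional cells:", entropy[300:].mean())
print("mean entropy, remaining cells:", entropy[:300].mean())
```

In a sketch like this, the `membership` matrix rather than `hard_labels` would be the input to downstream steps, so that cells with ambiguous identity are not forced into a single cluster.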

Keywords

Unsupervised learning

Variational Bayesian Estimation of a Gaussian Mixture

Pseudotime analysis

Dimensionality reduction

Single-cell genomics

Embryonic cortical development and breast cancer progression

Main Sponsor

Royal Statistical Society