Single-Cell Embedding from Language Models
Monday, Aug 4: 2:45 PM - 3:05 PM
Invited Paper Session
Music City Center
Various Foundation Models (FMs) have been built on the pre-training and fine-tuning paradigm to analyze single-cell data, with varying degrees of success. In this presentation, we describe scELMo (Single-cell Embedding from Language Models), a method for analyzing single-cell data that uses Large Language Models (LLMs) both to generate descriptions of metadata information and to produce embeddings of those descriptions. We combine the LLM embeddings with the raw data under a zero-shot learning framework, and further extend the method with a fine-tuning framework to handle different tasks. We demonstrate that scELMo is capable of cell clustering, batch-effect correction, and cell-type annotation without training a new model. Moreover, the fine-tuning framework of scELMo can help with more challenging tasks, including in-silico treatment analysis and modeling perturbations. scELMo has a lighter structure and lower resource requirements than recent large-scale FMs (such as scGPT and Geneformer), yet performs comparably to them in our evaluations, suggesting a promising path for developing domain-specific FMs.
Keywords: Single cell, foundation models, large language models, embedding, clustering, annotation
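To illustrate the zero-shot combination step described in the abstract, the sketch below shows one plausible way to combine LLM-derived embeddings with raw data: each cell's embedding is an expression-weighted average of gene-description embeddings. This is a minimal illustration, not scELMo's actual implementation; the function name, shapes, and weighting scheme are assumptions.

```python
import numpy as np

def cell_embeddings(expr, gene_emb):
    """Illustrative zero-shot cell embedding (assumed scheme, not scELMo's code).

    expr: (n_cells, n_genes) non-negative expression matrix.
    gene_emb: (n_genes, d) LLM embeddings of gene/metadata descriptions.
    Returns: (n_cells, d) cell embeddings as expression-weighted
    averages of the gene embeddings.
    """
    # Normalize each cell's expression into weights summing to 1.
    weights = expr / expr.sum(axis=1, keepdims=True)
    return weights @ gene_emb

# Toy example: 2 cells, 3 genes, embedding dimension 4.
expr = np.array([[1.0, 0.0, 1.0],
                 [0.0, 2.0, 0.0]])
gene_emb = np.random.rand(3, 4)
emb = cell_embeddings(expr, gene_emb)
print(emb.shape)  # (2, 4)
```

The resulting cell embeddings can then be fed to standard tools (e.g., clustering or nearest-neighbor annotation) without training a new model, which matches the zero-shot usage the abstract describes.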