Wednesday, Aug 6: 10:30 AM - 12:20 PM
0427
Invited Paper Session
Music City Center
Room: CC-208B
Biomedical imaging
-Omics data
High-dimensional data
Applied
Yes
Main Sponsor
ENAR
Co Sponsors
International Chinese Statistical Association
Section on Statistics in Imaging
Presentations
There is a growing body of literature on knowledge-guided statistical learning methods for the analysis of -omics data that incorporate knowledge of underlying networks derived from functional genomics and functional proteomics. These methods have been shown to improve variable selection and prediction accuracy and to yield more interpretable results. However, they typically use graphs extracted from existing databases or built from subject-matter expertise, and such graphs are known to be incomplete and may contain false edges. To address this gap, we propose a graph-guided Bayesian modeling framework that accounts for network noise in regression models involving structured high-dimensional predictors. Specifically, we use two sources of network information, namely the noisy graph extracted from existing databases and the graph estimated from the observed predictors in the dataset at hand, to inform the model for the true underlying network via a latent scale modeling framework. This model is coupled with a Bayesian regression model for structured high-dimensional predictors that uses an adaptive structured shrinkage prior. We develop an efficient Markov chain Monte Carlo algorithm for posterior sampling. We demonstrate the advantages of our method over existing methods in simulations and through analyses of a genomics dataset and a proteomics dataset for Alzheimer's disease.
Keywords
Brain network
Scalar-on-function prediction
MCMC algorithm
The progression of brain Tau pathology has been broadly believed to follow a stereotypical pattern. However, the latest advances in Tau-PET imaging suggest the existence of heterogeneous Tau emergence and progression, which could bias interventions that are based solely on the established Tau-targeting routine. Meanwhile, most Tau-PET imaging studies are either cross-sectional or have limited follow-up, which makes it even harder to uncover personalized Tau progression patterns. In this work, we address these hurdles by proposing a generative network-based diffusion model that accommodates cross-sectional observations and heterogeneity. Unlike existing models that rely on repeated measurements to characterize propagation networks, our method can uncover the spread network even with sparse observations. Based on extensive simulations and data analyses, we demonstrate the superiority of our method.
Keywords
Network modeling
Generative modeling
Imaging
Single-cell RNA sequencing (scRNA-seq) allows transcriptional profiling and cell-type annotation of individual cells. However, sample preparation in typical scRNA-seq experiments often homogenizes the samples, so the spatial locations of individual cells are often lost. Although spatial transcriptomic techniques, such as in situ hybridization (ISH) or Slide-seq, can be used to measure gene expression at specific locations in samples, it remains a challenge to measure or infer the expression level of every gene at single-cell resolution at every location in a tissue. Existing computational methods show promise in reconstructing these missing data by integrating scRNA-seq data with spatial expression data such as those obtained from spatial transcriptomics. Here we describe Laplacian Linear Optimal Transport (LLOT), an interpretable method that integrates single-cell and spatial transcriptomics data to reconstruct missing information at whole-genome and single-cell resolution. LLOT iteratively corrects platform effects and employs Laplacian Optimal Transport to decompose each spot in spatial transcriptomics data into a spatially smooth probabilistic mixture of single cells. We benchmarked LLOT against several other methods on Drosophila embryo and mouse cerebellum datasets and on synthetic datasets generated by scDesign3 in the paper, and on three additional datasets in the supplementary material. The results showed that LLOT consistently outperformed the other methods in reconstructing spatial expression.
Keywords
Data integration
Linear Map
Laplacian Optimal Transport
Spatial Expressions
scRNA-seq
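As a rough illustration of the idea of decomposing each spatial spot into a probabilistic mixture of single cells, the sketch below uses plain entropic optimal transport (Sinkhorn iterations) on random toy data. It is not LLOT: it has no Laplacian smoothness term and no platform-effect correction, and every name, array size and parameter value in it (`sinkhorn_plan`, `eps`, the gene counts) is an assumption made for the example.

```python
import numpy as np

def sinkhorn_plan(cost, a, b, eps=0.1, n_iter=500):
    """Entropic optimal transport plan between marginals a (rows) and b (columns)."""
    K = np.exp(-cost / eps)          # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)            # rescale to match column marginals
        u = a / (K @ v)              # rescale to match row marginals
    return u[:, None] * K * v[None, :]

# Hypothetical toy data: 30 cells with 12 genes, 8 spots measuring only 5 of them.
rng = np.random.default_rng(0)
n_cells, n_spots, n_shared, n_all = 30, 8, 5, 12
cells_all = rng.random((n_cells, n_all))
cells_shared = cells_all[:, :n_shared]
spots_shared = rng.random((n_spots, n_shared))

# Squared Euclidean cost on the shared genes, scaled to [0, 1].
cost = ((spots_shared[:, None, :] - cells_shared[None, :, :]) ** 2).sum(axis=-1)
cost /= cost.max()

a = np.full(n_spots, 1.0 / n_spots)  # uniform mass on spots (assumption)
b = np.full(n_cells, 1.0 / n_cells)  # uniform mass on cells (assumption)
plan = sinkhorn_plan(cost, a, b)

# Each row, normalized, is a probabilistic mixture of cells for one spot.
mixture = plan / plan.sum(axis=1, keepdims=True)

# Impute all genes at every spot as the mixture-weighted average of the cells.
reconstructed = mixture @ cells_all
```

Here `reconstructed` has one row per spot and one column per gene, including the genes that were not measured spatially. A spatially aware method would additionally couple the mixtures of neighboring spots.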
Many biomedical studies generate data from multiple sources or views, with the main goal of integrating these diverse but complementary data for deeper biological insights. Most existing integrative analysis methods only consider associations among the views and an outcome, without inferring potential causal relationships. Mediation analysis explores causal relationships between exposures and an outcome by including a mediator as an intermediate variable. Existing mediation analysis methods consider only univariate, single-view exposures, and none incorporate multi-view exposures. We propose Multi-view Multivariate Mediation Analysis (MMM), which considers both multivariate exposures and mediators and incorporates multi-view exposures. MMM integrates multi-view exposures by identifying disentangled common drivers that account for indirect effects via a multivariate mediator, while direct effects are estimated separately. Simulation studies are used to demonstrate the effectiveness of MMM. MMM is applied to data from the ADNI study to explore underlying mechanisms of Alzheimer's disease.
Keywords
Multimodal data integration
Causal mediation analysis
High-dimensional analysis
Variable selection
Multiview learning
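For readers unfamiliar with the direct/indirect split that mediation analysis relies on, the following minimal sketch shows the classical single-exposure, single-mediator case using the product-of-coefficients approach with ordinary least squares. It is not MMM: there are no multiple views and no multivariate exposures or mediators. The simulated data, the true coefficients and the helper name `ols` are all assumptions made for the example.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients of y on X, with an intercept as the first column."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

# Simulated data (assumed linear model): exposure -> mediator -> outcome.
rng = np.random.default_rng(1)
n = 5000
a_true, b_true, direct_true = 0.8, 0.5, 0.3
exposure = rng.normal(size=n)
mediator = a_true * exposure + rng.normal(size=n)
outcome = direct_true * exposure + b_true * mediator + rng.normal(size=n)

# Step 1: effect of the exposure on the mediator.
a_hat = ols(exposure, mediator)[1]

# Step 2: outcome on exposure and mediator gives the direct effect and the mediator effect.
_, direct_hat, b_hat = ols(np.column_stack([exposure, mediator]), outcome)

# Indirect effect is the product of the two path coefficients.
indirect_hat = a_hat * b_hat

# Total effect from regressing the outcome on the exposure alone.
total_hat = ols(exposure, outcome)[1]

print(f"direct={direct_hat:.3f} indirect={indirect_hat:.3f} total={total_hat:.3f}")
```

In this linear setting the total effect equals the direct effect plus the indirect effect, which is the decomposition that multivariate and multi-view extensions generalize.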