Leveraging Machine Learning Methods to Determine the “Omics” Risks of Aging Comorbidities

Theresa Kim Chair
National Institutes of Health, National Institute on Aging
 
Daniel Felsky Discussant
Centre for Addiction and Mental Health
 
Theresa Kim Organizer
National Institutes of Health, National Institute on Aging
 
Jun Young Park Organizer
 
Shuo Chen Organizer
University of Maryland, School of Medicine
 
Tuesday, Aug 5: 2:00 PM - 3:50 PM
0189 
Invited Paper Session 
Music City Center 
Room: CC-207A 

Keywords

Machine Learning

AD/ADRD

MCI

Neuroimaging 

Applied

Yes

Main Sponsor

ENAR

Co Sponsors

Caucus for Women in Statistics
SSC (Statistical Society of Canada)

Presentations

Application of sparse group lasso in Transcriptome-Wide Association Studies to determine pathway-level genetic risk of brain aging

Brain aging involves the gradual loss of structure and function of neurons and their connections, leading to cognitive decline and increased vulnerability to neurodegenerative diseases, including Alzheimer's disease (AD). We have conducted a number of studies in UK Biobank (UKB) to identify nongenetic risk factors (including smoking, blood pressure, allostatic load, diet, and Life's Essential 8) of white matter (WM) Brain Age Gap (BAG), a marker of brain aging predicted by a machine learning algorithm from multiple fractional anisotropy tract measurements obtained from diffusion tensor imaging data. However, little is known about the genetic risk of brain aging. Genome-wide association studies (GWAS) identify associations and risk only at the SNP level. Transcriptome-wide association studies (TWAS), by contrast, integrate GWAS data with expression reference panels (e.g., expression QTL) to identify associations at the gene level, potentially improving interpretability. Existing TWAS methods, however, are predominantly univariate, and the genes they identify share few unifying biological themes, which limits their use in determining the genetic risk of complex polygenic traits such as brain aging. We developed a novel pathway-guided TWAS method with an embedded sparse group lasso to select the genes and pathways most associated with brain aging, using imaging and genetic data from UKB. We incorporated curated pathway databases, including KEGG, Reactome, and BioCarta, and identified five major categories of pathways (neural system, DNA repair, DNA metabolism, protein metabolism, and immune defense) most associated with WM BAG, which cannot be found by existing TWAS methods. Our findings provide new insights into the genetics of brain aging and improve our understanding of the molecular mechanisms of the aging brain and the transition to AD.
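For readers unfamiliar with the penalty named in this abstract, the standard sparse group lasso objective is sketched below. The abstract does not state the exact objective used in the pathway-guided TWAS method, so the notation is an assumption for illustration only: $y$ is the outcome (e.g., WM BAG) for $n$ subjects, $X_g$ holds the predicted expression of the $p_g$ genes in pathway $g$, $\beta_g$ is the corresponding coefficient block, and $\lambda \ge 0$ and $\alpha \in [0,1]$ are tuning parameters.

```latex
\hat{\beta}
  = \arg\min_{\beta}\;
    \frac{1}{2n}\,\Bigl\| y - \sum_{g=1}^{G} X_g \beta_g \Bigr\|_2^2
    \;+\; (1-\alpha)\,\lambda \sum_{g=1}^{G} \sqrt{p_g}\,\|\beta_g\|_2
    \;+\; \alpha\,\lambda\,\|\beta\|_1
```

The group term can set an entire pathway's coefficients to zero, while the $\ell_1$ term selects individual genes within the pathways that remain. This textbook form assumes non-overlapping groups; curated pathways typically share genes, and how the method handles that overlap is not described in the abstract.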

Keywords

transcriptome-wide association studies (TWAS)

white matter brain aging

pathway

sparse group lasso 

Speaker

Tianzhou Ma, University of Maryland, College Park

Enhancing Predicted Gene Expression Models of Alzheimer's Disease Leveraging Single Cell Datasets

The genetic architecture of Alzheimer's disease and related dementias (ADRD) is complex and polygenic. The advent of machine learning models that leverage expression quantitative trait loci (eQTL) databases to inform genomic prediction (e.g., PrediXcan) has dramatically improved statistical power and biological interpretation in genomic studies of ADRD. We build on these methods by improving cellular resolution, leveraging single nucleus RNA sequencing datasets and deep quantitative traits harmonized as part of the Alzheimer's Disease Sequencing Project Phenotype Harmonization Consortium (ADSP-PHC). First, we demonstrate the accuracy of predicted expression models for numerous cell types by validating model builds in an independent dataset, focusing on predicted gene expression models from excitatory neurons as an example. Next, we demonstrate novel associations with amyloid, tau, and cognitive decline using these prediction models. Finally, we characterize our top gene candidate by exploring associations with observed expression at the bulk and single cell level to demonstrate the power of well-informed models of gene expression at single cell resolution.

Speaker

Timothy Hohman, Vanderbilt University Medical Center

Collaborative Quantile Treatment Effect Estimation for Distributed Alzheimer's Research Data

Our recent analyses of the NACC data have revealed substantial response-dependent heterogeneity in the causal effects of repurposed drugs (e.g., metformin), social indicators (e.g., living alone), and genetic factors (e.g., APOE genotype) in AD patients. This heterogeneity in response highlights the need for quantile treatment effect (QTE) estimation, which captures how treatment effects vary across the distribution of clinical outcomes—beyond what average treatment effect (ATE) approaches can reveal.
However, estimating QTEs in modern AD research presents significant methodological and computational challenges. Large-scale observational, biomarker, and neuroimaging datasets are often distributed across sites, with privacy constraints and limited data-sharing infrastructure preventing centralized analysis.

We introduce SCQTE, a sequential collaborative method for scalable and privacy-preserving QTE estimation across distributed data environments. SCQTE accommodates both conditional and unconditional QTEs, requires only one or two rounds of inter-site communication, and achieves estimation accuracy equivalent to oracle estimators using pooled data. This work provides a practical solution for collaborative causal inference in aging and dementia research at scale.
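
The contrast between average and quantile treatment effects drawn in this abstract can be illustrated with a small simulation. The sketch below is not SCQTE and involves no distributed computation; it assumes a single pooled dataset, a randomized binary treatment, and simulated outcomes, and the function names are hypothetical. It shows a treatment that leaves the mean unchanged but widens the outcome distribution, so the ATE is near zero while the unconditional QTE differs in sign between the lower and upper tails.

```python
import numpy as np


def average_treatment_effect(y, treated):
    """Difference in mean outcome between treated and control units."""
    return y[treated].mean() - y[~treated].mean()


def quantile_treatment_effect(y, treated, tau):
    """Unconditional QTE at level tau: difference in the tau-th quantile
    of the outcome between treated and control units.
    Valid as a causal contrast only under randomized treatment (assumed here)."""
    return np.quantile(y[treated], tau) - np.quantile(y[~treated], tau)


# Simulated data (illustrative only, not NACC data).
rng = np.random.default_rng(0)
n = 20000
treated = rng.random(n) < 0.5           # randomized binary treatment
y_control = rng.normal(0.0, 1.0, n)     # control outcomes: N(0, 1)
y_treated = rng.normal(0.0, 2.0, n)     # treated outcomes: same mean, wider spread
y = np.where(treated, y_treated, y_control)

ate = average_treatment_effect(y, treated)
qte_low = quantile_treatment_effect(y, treated, 0.1)
qte_high = quantile_treatment_effect(y, treated, 0.9)

print(f"ATE:       {ate:.3f}")
print(f"QTE(0.1):  {qte_low:.3f}")
print(f"QTE(0.9):  {qte_high:.3f}")
```

In observational settings such as the one the abstract describes, confounding adjustment would be needed before these quantile contrasts could be read causally; the simulation sidesteps that by randomizing treatment.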
 

Speaker

Nan Lin, Washington University in St. Louis

Mapping Individual Differences in Intermodal Coupling in Neurodevelopment

Within-individual coupling between measures of brain structure and function evolves in development and may underlie differential risk for neuropsychiatric disorders. Despite increasing interest in the development of structure-function relationships, rigorous methods to quantify and test individual differences in coupling remain nascent. In this article, we explore and address gaps in approaches for testing and spatially localizing individual differences in intermodal coupling. We propose a new method, called CIDeR, which is designed to perform hypothesis testing in a way that simultaneously limits false positive results and improves detection of true positive results. By comparing different approaches to testing individual differences in intermodal coupling, we delineate subtle differences in the hypotheses they test, which may ultimately lead researchers to different results. Finally, we illustrate the utility of CIDeR in two applications to brain development using data from the Philadelphia Neurodevelopmental Cohort.

Keywords

Neuroimaging

Alzheimer's disease

Spatial statistics

Latent factor model 

Speaker

Jun Young Park