Tuesday, Aug 5: 2:00 PM - 3:50 PM
0767
Topic-Contributed Paper Session
Music City Center
Room: CC-105B
Applied
Yes
Main Sponsor
International Indian Statistical Association
Co Sponsors
Section on Statistics in Epidemiology
Section on Statistics in Genomics and Genetics
Presentations
Polygenic risk score (PRS) prediction in non-European populations remains challenging due to limited GWAS sample sizes and scarce individual-level tuning data. While several multi-population PRS methods have been proposed, none consistently achieves optimal performance across diverse data scenarios, particularly when tuning data are unavailable. In this presentation, I will introduce JointPRS, a comprehensive framework that models multiple populations and estimates chromosome-wise cross-population genetic correlation using GWAS summary statistics. JointPRS has robust performance even without individual-level datasets for tuning parameters. When non-European individual-level data are available, we propose a data-adaptive approach that combines meta-analysis and tuning strategies and inherits the merits of both, further enhancing prediction performance and robustness. To address scenarios where no single method dominates (e.g., high causal SNP proportions), I will further discuss MIX, a novel framework that optimally integrates predictions from diverse PRS methods (e.g., JointPRS and SDPRX) using only GWAS summary statistics. MIX employs data fission, a subsampling strategy that partitions GWAS data into pseudo-training and pseudo-testing sets, and incorporates a SNP pruning step to mitigate the linkage disequilibrium (LD) mismatch issue. I will present comprehensive evaluations of JointPRS through its application to 26 traits across five populations (European, East Asian, African, South Asian, and Admixed American) in the UK Biobank and All of Us cohorts. Results show that JointPRS outperformed six state-of-the-art PRS methods (SDPRX, XPASS, PRS-CSx, PROSPER, MUSSEL, and BridgePRS) in most scenarios. Furthermore, MIX can effectively integrate different PRS methods, optimizing prediction performance across all settings by leveraging the advantages across methods.
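To make the data fission idea in the abstract above concrete, the following is a minimal sketch of the Gaussian form of data fission applied to per-SNP marginal effect estimates. It is an illustration under stated assumptions, not the MIX procedure itself: SNPs are treated as independent (as they might be after pruning), each estimate is assumed normal with a known standard error, and the function name `fission_split` and the parameter `train_frac` are hypothetical.

```python
import numpy as np

def fission_split(beta_hat, se, train_frac=0.8, rng=None):
    """Split summary statistics into independent pseudo-training and
    pseudo-testing sets (Gaussian data fission, independent SNPs assumed)."""
    rng = np.random.default_rng() if rng is None else rng
    beta_hat = np.asarray(beta_hat, dtype=float)
    se = np.asarray(se, dtype=float)

    # tau is chosen so the two halves have the variances expected from
    # sub-samples holding train_frac and (1 - train_frac) of the data.
    tau = np.sqrt((1.0 - train_frac) / train_frac)

    # Auxiliary noise with the same per-SNP standard error as the estimate.
    z = rng.normal(0.0, se)

    # The noise terms of the two halves are uncorrelated:
    # Cov(e + tau*z, e - z/tau) = se^2 - se^2 = 0.
    beta_train = beta_hat + tau * z
    beta_test = beta_hat - z / tau

    se_train = se / np.sqrt(train_frac)
    se_test = se / np.sqrt(1.0 - train_frac)
    return beta_train, se_train, beta_test, se_test
```

In this sketch a larger `train_frac` gives more precise pseudo-training estimates at the cost of noisier pseudo-testing estimates, which mirrors the usual trade-off when splitting individual-level data.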
The usual practice for building and applying polygenic risk scores is to first divide the populations being studied into ancestrally homogeneous subsets, then perform genome-wide association studies on the subsets and develop polygenic risk scores for each subset. When the scores are applied to assess risk for disease, some sort of correction is needed to address heterogeneity in the target population. This approach is disadvantageous because it cannot include individuals of mixed ancestry in the initial risk score modeling, ignores the variability in linkage disequilibrium among populations, which can refine the identification of causal variants, and cannot be applied directly to individuals of mixed ancestry or to those who do not align with the assumed homogeneous sets. In this presentation, we provide an alternate approach based on optimal application of mixed models that can include related individuals and individuals of mixed ancestry in the polygenic risk model development. We subsequently evaluate polygenic risk score application to populations who are enrolled in clinical trials to assess the impact of precision behavioral medicine on changing smoking behavior and adoption of lung cancer screening.
Precision medicine aims to personalize treatments by leveraging patients' molecular markers. Polygenic risk scores (PRSs) have emerged as promising tools for improving drug response prediction and patient stratification, thereby accelerating the advancement of precision medicine. This talk will explore three primary strategies for developing PRSs for precision medicine: (1) utilizing large-scale genome-wide association studies (GWAS) of related diseases, (2) leveraging independent pharmacogenomics (PGx) studies of related drug response, and (3) jointly modeling both. Each of these strategies presents unique advantages and disadvantages. This talk critically evaluates these strategies, focusing on their ability to capture prognostic and predictive effects, predict drug response, and effectively stratify patients. We further delve into novel PRS methods we have developed for building predictive PRSs with differential treatment effects, including machine learning, Bayesian, and transfer learning-based approaches. Lastly, practical considerations and statistical insights for developing robust PRSs in the context of randomized clinical trials will also be discussed.
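The distinction between prognostic and predictive effects in the abstract above can be illustrated with a simple linear model in a two-arm randomized trial. This is a generic sketch, not one of the methods presented in the talk: it assumes a continuous outcome, a binary treatment indicator, and the common convention that the prognostic effect is the PRS main effect while the predictive effect is the PRS-by-treatment interaction. The function name `fit_prs_by_treatment` is hypothetical.

```python
import numpy as np

def fit_prs_by_treatment(outcome, treatment, prs):
    """Least-squares fit of
    outcome = a + b*treatment + c*prs + d*treatment*prs + error,
    where c is read as the prognostic effect and d as the predictive effect."""
    outcome = np.asarray(outcome, dtype=float)
    treatment = np.asarray(treatment, dtype=float)
    prs = np.asarray(prs, dtype=float)

    # Design matrix: intercept, treatment, PRS, and their interaction.
    design = np.column_stack(
        [np.ones_like(prs), treatment, prs, treatment * prs]
    )
    coef, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    names = ["intercept", "treatment", "prognostic", "predictive"]
    return dict(zip(names, coef))
```

A PRS with a large `prognostic` coefficient but a `predictive` coefficient near zero separates patients by expected outcome without indicating who benefits more from treatment, which is why the two effects are evaluated separately.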
Family-based studies provide a unique opportunity to characterize genetic risks of diseases in the presence of population structure, assortative mating, and indirect genetic effects. We propose a likelihood-based method, PGS-TRI, for the analysis of polygenic scores (PGS) in case-parent trio studies for estimation of the risk of an index condition associated with direct effects of inherited PGS, indirect effects of parental PGS, and gene-environment interactions. We assume the disease risk follows a log-linear model and PGS follows a normal distribution, allowing for family-specific effects in both components to account for arbitrary population structures. We show that likelihood calculations can be simplified into a parental PGS component and a transmission-based likelihood, from which genetic effect estimates can be derived in closed form. Extensive simulation studies demonstrate the robustness of PGS-TRI in the presence of complex population structure and assortative mating compared to alternative methods. We apply PGS-TRI to multi-ancestry trio studies of autism spectrum disorders and orofacial clefts to establish the first transmission-based estimates of risk associated with pre-defined PGS for these conditions and other related traits. For both conditions, we further explore offspring risk associated with polygenic gene-environment interactions, and direct and indirect effects of genetically predicted levels of gene expression and metabolite traits.
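Why a log-linear risk model combined with a normal PGS can yield a closed-form, transmission-based estimate may be seen from a simplified calculation. If, given the parents, the offspring PGS is normal with mean equal to the mid-parental PGS and variance s², and disease risk is proportional to exp(β · PGS), then among affected offspring the PGS is normal with mean shifted by β · s². The sketch below simulates this and recovers β from the mean excess of case PGS over the mid-parental value. It is a toy version only, not the PGS-TRI estimator: it assumes a single known within-family variance, no indirect effects, no family-specific terms, and no gene-environment interaction, and all function and parameter names are hypothetical.

```python
import numpy as np

def simulate_case_trios(n_families, beta, within_var, log_baseline=-6.0, seed=0):
    """Simulate trios under a log-linear risk model and keep affected offspring."""
    rng = np.random.default_rng(seed)
    mother = rng.normal(0.0, 1.0, n_families)
    father = rng.normal(0.0, 1.0, n_families)
    # Offspring PGS: mid-parental value plus within-family (segregation) noise.
    child = (mother + father) / 2.0 + rng.normal(
        0.0, np.sqrt(within_var), n_families
    )
    # Log-linear risk, capped at 1 as a numerical safeguard for a rare disease.
    risk = np.minimum(np.exp(log_baseline + beta * child), 1.0)
    is_case = rng.random(n_families) < risk
    return child[is_case], mother[is_case], father[is_case]

def transmission_effect(pgs_child, pgs_mother, pgs_father, within_var):
    """Estimate beta as mean(child - midparent) / within-family variance."""
    midparent = (np.asarray(pgs_mother) + np.asarray(pgs_father)) / 2.0
    excess = np.asarray(pgs_child) - midparent
    return excess.mean() / within_var

if __name__ == "__main__":
    child, mother, father = simulate_case_trios(
        1_000_000, beta=0.5, within_var=0.5, seed=1
    )
    print(transmission_effect(child, mother, father, within_var=0.5))
```

Because the estimate depends only on the deviation of each affected child from their own parents, differences in mean PGS between families, such as those induced by population structure, cancel out in this toy setting.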