Thursday, Aug 7: 8:30 AM - 10:20 AM
0675
Topic-Contributed Paper Session
Music City Center
Room: CC-214
Applied
Yes
Main Sponsor
Biopharmaceutical Section
Co Sponsors
Biometrics Section
Health Policy Statistics Section
Presentations
Assessing heterogeneity in the effects of treatments has become increasingly popular in the field of causal inference and carries important implications for clinical decision-making. While extensive literature exists for studying treatment effect heterogeneity when outcomes are fully observed, there has been limited development of tools for estimating heterogeneous causal effects when patient-centered outcomes are truncated by a terminal event, such as death. Because mortality occurs during study follow-up, the outcomes of interest are unobservable, undefined, or not fully observed for many participants, in which case principal stratification is an appealing framework for drawing valid causal conclusions. Motivated by the Acute Respiratory Distress Syndrome Network (ARDSNetwork) ARDS respiratory management (ARMA) trial, we developed a flexible Bayesian machine learning approach to estimate the average causal effect and heterogeneous causal effects among the always-survivors stratum when clinical outcomes are subject to truncation. We adopted Bayesian additive regression trees (BART) to flexibly specify separate mean models for the potential outcomes and latent stratum membership. In the analysis of the ARMA trial, we found that the low tidal volume treatment had an overall benefit on time to returning home for participants sustaining acute lung injuries, but with substantial heterogeneity in treatment effects among the always-survivors, driven most strongly by biologic sex and the alveolar-arterial oxygen gradient at baseline (a physiologic measure of lung function and degree of hypoxemia). These findings illustrate how the proposed methodology could guide the prognostic enrichment of future trials in the field.
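The estimands described in this abstract can be written compactly in the notation commonly used for principal stratification. The notation is an assumption for illustration, since the abstract does not fix any: Z is the treatment arm, S(z) is the potential survival indicator under arm z, Y(z) is the potential outcome, and X denotes baseline covariates. The always-survivors are the participants with S(1) = S(0) = 1.

```latex
\begin{align*}
\text{SACE} &= E\{Y(1) - Y(0) \mid S(1) = S(0) = 1\}, \\
\text{SACE}(x) &= E\{Y(1) - Y(0) \mid S(1) = S(0) = 1,\; X = x\}.
\end{align*}
```

The contrast is restricted to this stratum because Y(z) is undefined whenever S(z) = 0, so it is the only stratum in which both potential outcomes exist.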
Keywords
causal inference
heterogeneity of treatment effects
intercurrent events
principal stratification
truncation by death
acute lung injury
Speaker
Fan Li, Yale School of Public Health
To support expedited drug development that addresses unmet medical needs in heterogeneous patient populations, the seamless phase 2/3 design, which bases the phase-switching decision on an early surrogate endpoint, is gaining popularity in practice. To also cater to patient subgroups that may benefit more, as identified by predictive biomarkers, it is appealing to incorporate a subgroup enrichment feature into the seamless phase 2/3 design. However, sample size planning for such a complex adaptive design is challenging, as it must strike a balance among shortening the development timeline, mitigating development risks, and accounting for uncertainty related to subgroup effects. To fill this gap, we propose a flexible seamless phase 2/3 design framework with population selection and sample size re-estimation using a surrogate endpoint. We elucidate the patterns of the overall type I error for the proposed adaptive design and propose an easy-to-implement approach to control it. Extensive simulation studies are conducted to demonstrate the advantages of our proposed design compared to the fixed-sample design in terms of efficiency, power, and timeline savings.
Keywords
Adaptive design
Subgroup enrichment
Heterogeneous treatment effect
Speaker
Liwen Wu, Takeda Pharmaceuticals
Health-care policy makers are often interested in the cost-effectiveness of an intervention. Effectiveness is usually measured by quality-adjusted life years, which are subject to informative censoring. Both effectiveness and costs are often assessed from large-scale observational studies and databases (e.g., claims data, large cohort studies) and are thus susceptible to confounding. A rich literature is available for accommodating censoring and adjusting for confounding factors. However, most cost-effectiveness studies are primarily concerned with the terminal event rather than the entire disease progression. Motivated by the problem of informing the optimal initial screening age for colorectal cancer (CRC) through cost-effectiveness analysis, we provide a unified measure of cost-effectiveness with semi-competing risks and multistate modeling, which allows us to gain insights into benefit and cost at each stage of cancer progression. Unlike most existing causal inference work, which focuses on static interventions, we develop a causal framework and estimation procedure to evaluate cost-effectiveness as a function of a time-varying screening strategy. These methods are justified theoretically and numerically using both simulation and CRC data from the Women's Health Initiative observational study.
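For readers less familiar with the summary measures generally used in cost-effectiveness analysis, the sketch below computes the incremental cost-effectiveness ratio (ICER) and the incremental net monetary benefit for two strategies. It is a generic illustration, not the unified multistate measure proposed in this talk. The function names, the willingness-to-pay parameter `wtp`, and all numbers are hypothetical.

```python
def icer(cost_new, cost_ref, qaly_new, qaly_ref):
    """Incremental cost per additional QALY of the new strategy vs. the reference."""
    d_cost = cost_new - cost_ref
    d_qaly = qaly_new - qaly_ref
    if d_qaly == 0:
        raise ValueError("ICER is undefined when the QALY difference is zero")
    return d_cost / d_qaly


def net_monetary_benefit(cost_new, cost_ref, qaly_new, qaly_ref, wtp):
    """Incremental net monetary benefit at willingness-to-pay `wtp` per QALY."""
    return wtp * (qaly_new - qaly_ref) - (cost_new - cost_ref)


# Hypothetical inputs: mean cost and mean QALYs under two screening strategies.
ratio = icer(12000.0, 10000.0, 10.5, 10.0)
benefit = net_monetary_benefit(12000.0, 10000.0, 10.5, 10.0, wtp=50000.0)
```

A positive net monetary benefit at a given `wtp` means the new strategy is considered cost-effective at that threshold. In observational data, the mean costs and QALYs fed into these formulas would themselves need adjustment for censoring and confounding, as the abstract notes.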
Background: Identifying latent subgroups in heterogeneous populations is key to understanding disease mechanisms and advancing precision medicine. Although high-dimensional omics and longitudinal clinical data provide rich phenotypic and molecular insights, few methods jointly model outcome dynamics and molecular heterogeneity. We introduce TPClust, a supervised generative subtyping model that integrates longitudinal outcomes with high-dimensional molecular data, flexibly accounting for time-varying and static covariates.
Methods: TPClust models covariate effects as smooth functions of time via nonparametric splines and applies structured regularization—sparse group and exclusive lasso—for robust subtype-specific feature selection. Inference uses a scalable variational EM algorithm with bootstrap-based confidence intervals. We applied TPClust to 1,020 adults from the Religious Orders Study and Memory and Aging Project (ROSMAP), integrating longitudinal cognitive trajectories with postmortem prefrontal cortex transcriptomics in Alzheimer's Disease (AD). Analyses adjusted for sex, APOE ε4, and vascular risk factors. We estimated subtype-specific time-varying effects and examined differences in neuropathology, proteomic, and epigenomic markers. Simulation studies evaluated model accuracy.
Results: TPClust uncovered four distinct aging subtypes: Resilient (n=642), Late-Onset Decline (n=102), Early Vulnerability (n=76), and Rapid Decline (n=200). Resilient individuals maintained high cognition and low pathology with preserved synaptic and mitochondrial function. Late-Onset Decline remained stable until age 85, then exhibited accelerated decline among individuals with APOE ε4, diabetes, and stroke, accompanied by a moderate pathological burden. Early Vulnerability showed an earlier, steeper decline after age 84 and increased vulnerability to stroke, frailty, and male sex, along with reduced neuronal resilience and elevated stress-response markers. Rapid Decline exhibited the earliest deterioration (starting ~age 73), highest dementia risk (87% by age 85), and greatest burden of amyloid, tau, TDP-43, and vascular pathology, alongside broad vulnerability to genetic and vascular factors and dysregulation of tau transcription, blood–brain barrier integrity, and inflammation. Simulation studies confirmed TPClust's accuracy in subtyping, time-varying inference, and high-dimensional feature selection.
Conclusions: TPClust offers a robust framework for outcome-guided subtyping in longitudinal clinical data and molecular data. It reveals distinct cognitive and mechanistic profiles among aging and AD subtypes, advancing biomarker discovery, disease stratification, and precision medicine strategies.
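The two structured penalties named in the Methods can be illustrated with their standard textbook forms. This is a minimal sketch under assumptions: the abstract does not state the exact parameterization used in TPClust, and the function names, the group layout, and the tuning parameters `lam` and `alpha` are hypothetical.

```python
import math


def sparse_group_lasso_penalty(beta, groups, lam, alpha):
    """lam * [alpha * ||beta||_1 + (1 - alpha) * sum_g sqrt(p_g) * ||beta_g||_2].

    `groups` is a list of index lists partitioning the coefficients.
    """
    l1 = sum(abs(b) for b in beta)
    group_l2 = sum(
        math.sqrt(len(g)) * math.sqrt(sum(beta[j] ** 2 for j in g))
        for g in groups
    )
    return lam * (alpha * l1 + (1.0 - alpha) * group_l2)


def exclusive_lasso_penalty(beta, groups, lam):
    """(lam / 2) * sum_g (||beta_g||_1)^2, which favors sparsity within each group."""
    return 0.5 * lam * sum(sum(abs(beta[j]) for j in g) ** 2 for g in groups)


# Hypothetical coefficients split into two groups of two features each.
beta = [3.0, -4.0, 0.0, 2.0]
groups = [[0, 1], [2, 3]]
sgl = sparse_group_lasso_penalty(beta, groups, lam=1.0, alpha=0.5)
excl = exclusive_lasso_penalty(beta, groups, lam=1.0)
```

The sparse group lasso can zero out whole groups as well as individual coefficients, while the exclusive lasso encourages at least some selection in every group but competition among features within a group.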
Keywords
Disease subtyping
Integrative approach
Longitudinal clinical data
High-dimensional omics data
Clustering mixed-type data is a major challenge in biopharmaceutical research, particularly for phenotyping complex diseases where patient heterogeneity complicates treatment. Existing methods often assume local independence or fail to handle high-dimensional datasets with correlated continuous and categorical variables and censored biomarkers. We propose a Bayesian finite mixture model (BFMM) that integrates flexible dependence structures, spike-and-slab priors for variable importance, and a specialized Gibbs sampling step for imputing censored biomarkers. BFMM enables stable clustering and provides interpretable importance weights for both variable types, offering insights into cluster assignments. Simulations show BFMM outperforms existing methods, particularly for correlated data with varying censoring levels. Application to real-world datasets further validates its effectiveness. Our findings underscore BFMM's potential as a robust, interpretable tool for biomedical data analysis, with implications for precision medicine and targeted interventions.
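One common way to impute a censored biomarker inside a Gibbs sampler is to draw it from its conditional distribution truncated to the censored region. The sketch below assumes left-censoring at a limit of detection and a normal conditional distribution. Both are assumptions for illustration, as the abstract does not describe the specialized step used in BFMM, and the function and parameter names are hypothetical.

```python
import random
import sys
from statistics import NormalDist


def impute_below_lod(mu, sigma, lod, rng):
    """Draw from N(mu, sigma^2) truncated to (-inf, lod] by inverse-CDF sampling."""
    std = NormalDist()
    # Probability mass of the conditional distribution below the detection limit.
    p_upper = std.cdf((lod - mu) / sigma)
    u = rng.uniform(0.0, p_upper)
    # Keep u strictly inside (0, 1) so that inv_cdf is defined.
    u = min(max(u, sys.float_info.min), 1.0 - 2.0 ** -53)
    # Guard against rounding pushing the draw above the limit.
    return min(mu + sigma * std.inv_cdf(u), lod)


# Hypothetical conditional mean and standard deviation for one censored value.
rng = random.Random(0)
draws = [impute_below_lod(mu=1.0, sigma=0.5, lod=0.2, rng=rng) for _ in range(200)]
```

In a full sampler, `mu` and `sigma` would be the conditional mean and standard deviation of the biomarker given the current cluster assignment and the other variables, and the draw would be refreshed at every iteration.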