Tuesday, Aug 5: 8:30 AM - 10:20 AM
0709
Topic-Contributed Paper Session
Music City Center
Room: CC-105B
In this session, the five student paper award winners from the Lifetime Data Science Section Student Paper Competition will present their award-winning work. These papers collectively cover several exciting novel developments in the field of survival analysis.
Lifetime analysis
quantile regression
multistate models
prediction
recurrent events
clinical trials
Applied
Yes
Main Sponsor
Lifetime Data Science Section
Presentations
We introduce a computationally efficient and general approach for utilizing multiple, possibly interval-censored, data streams to study complex biomedical endpoints using multistate semi-Markov models. Our motivating application is the REGEN-2069 trial, which investigated the protective efficacy (PE) of the monoclonal antibody combination REGEN-COV against SARS-CoV-2 when administered prophylactically to individuals in households at high risk of secondary transmission. Using data on symptom onset, episodic RT-qPCR sampling, and serological testing, we estimate the PE of REGEN-COV for asymptomatic infection, its effect on seroconversion following infection, and the duration of viral shedding. We find that REGEN-COV reduced the risk of asymptomatic infection and the duration of viral shedding, and led to lower rates of seroconversion among asymptomatically infected participants. To fit semi-Markov models to interval-censored data, we employ a Monte Carlo expectation-maximization (MCEM) algorithm combined with importance sampling to efficiently address the intractability of the marginal likelihood when data are intermittently observed. Our algorithm provides substantial computational improvements over existing methods and allows us to fit semi-parametric models despite complex coarsening of the data.
Keywords
Data Assimilation
Interval Censoring
Monte Carlo Expectation-Maximization
Semi-Markov Multistate Models
Panel Data
Splines
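As a toy illustration of the Monte Carlo EM idea mentioned in the abstract above, the sketch below fits a single exponential rate to interval-censored event times. It is not the authors' algorithm: it assumes a one-state parametric model rather than a semi-Markov multistate model, it samples the latent times exactly instead of using importance sampling, and all function names are hypothetical.

```python
import math
import random


def sample_truncated_exponential(rate, left, right, rng):
    """Draw T ~ Exponential(rate) conditional on left < T <= right (inverse CDF)."""
    lo = 1.0 - math.exp(-rate * left)
    hi = 1.0 - math.exp(-rate * right)
    u = rng.uniform(lo, hi)
    return -math.log(1.0 - u) / rate


def mcem_exponential_rate(intervals, n_iter=50, n_draws=200, seed=0):
    """Monte Carlo EM for an exponential rate when each event time is only
    known to lie in a finite interval (left, right]."""
    rng = random.Random(seed)
    rate = 1.0  # arbitrary starting value
    for _ in range(n_iter):
        # Monte Carlo E-step: approximate E[T_i | interval_i, current rate]
        # by averaging draws from the truncated exponential.
        expected_total = 0.0
        for left, right in intervals:
            draws = [
                sample_truncated_exponential(rate, left, right, rng)
                for _ in range(n_draws)
            ]
            expected_total += sum(draws) / n_draws
        # M-step: complete-data MLE of the rate with imputed mean event times.
        rate = len(intervals) / expected_total
    return rate


if __name__ == "__main__":
    observed = [(0.0, 1.0), (1.0, 2.0), (0.0, 1.0), (2.0, 3.0)]
    print(mcem_exponential_rate(observed))
```

Because every imputed time stays inside its interval, the fitted rate is always bounded by the number of subjects divided by the sum of the right endpoints and by the sum of the left endpoints, which gives a quick sanity check on the output.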
In contemporary cancer research, there is an increasing need to incorporate longitudinal biomarkers into randomized clinical trials to develop personalized treatment strategies that adapt to biomarkers measured prior to treatment. However, existing statistical methods for sample size/power calculations in clinical trials with survival outcomes often overlook the integration of longitudinal biomarker data. This article presents a sample size/power calculation formula with a robust inference method for estimating treatment effects without relying on distributional assumptions regarding random effects from longitudinal biomarkers. The proposed formula requires only easily accessible quantities from existing literature or pilot studies, avoiding any distributional constraints on survival or censoring times. Extensive simulation studies demonstrate that the proposed inference method and sample size/power calculation formula exhibit strong finite-sample performance. The practical application of this method is illustrated through the design of a lung cancer prevention trial, utilizing data from the National Lung Screening Trial (NLST).
Keywords
Clinical trial design
Longitudinal-survival joint modeling
Personalized Treatment Strategies
Power Calculation
Speaker
Haolin Li, University of North Carolina at Chapel Hill
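For context on the kind of existing calculation the abstract above contrasts itself with, here is a minimal sketch of the classical Schoenfeld formula for the number of events needed by a two-arm log-rank comparison. This is a standard textbook baseline, not the formula proposed in the paper, and it ignores longitudinal biomarkers entirely; the function name and default arguments are illustrative assumptions.

```python
import math
from statistics import NormalDist


def required_events(hazard_ratio, alpha=0.05, power=0.80, allocation=0.5):
    """Schoenfeld approximation: events needed for a two-sided log-rank test.

    allocation is the fraction of participants randomized to one arm.
    """
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_beta = NormalDist().inv_cdf(power)
    log_hr = math.log(hazard_ratio)
    events = (z_alpha + z_beta) ** 2 / (
        allocation * (1.0 - allocation) * log_hr ** 2
    )
    return math.ceil(events)


if __name__ == "__main__":
    # Events needed to detect a hazard ratio of 0.5 with 80% power.
    print(required_events(0.5))
```

The total sample size then follows by dividing the required number of events by the anticipated overall event probability, which is the quantity a pilot study or prior literature would have to supply.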
Censored survival data are common in clinical trials, but small control groups can pose challenges, particularly in rare diseases or where balanced randomization is impractical. Recent approaches leverage external controls from historical studies or real-world data to strengthen treatment evaluation for survival outcomes. However, using external controls directly may introduce biases due to data heterogeneity. We propose a doubly protected estimator for the treatment-specific restricted mean survival time difference that is more efficient than trial-only estimators and mitigates biases from external data. Our method adjusts for covariate shifts via doubly robust estimation and addresses outcome drift using the DR-Learner for selective borrowing. The approach incorporates machine learning to approximate survival curves and detect outcome drifts without strict parametric assumptions, borrowing only comparable external controls. Extensive simulation studies and a real-data application evaluating the efficacy of Galcanezumab in mitigating migraine headaches have been conducted to illustrate the effectiveness of our proposed framework.
Keywords
Adaptive learning
Monotone coarsening
unmeasured confounding
data heterogeneity
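The estimand in the abstract above is a difference in restricted mean survival time (RMST), that is, the area under each arm's survival curve up to a truncation time. The sketch below computes a plain Kaplan-Meier-based RMST difference from trial data only. It is a simple illustration of the estimand under the assumption of independent censoring; it does not implement the doubly protected estimator, covariate adjustment, or external-control borrowing, and the function names are hypothetical.

```python
def kaplan_meier(times, events):
    """Return the Kaplan-Meier curve as a list of (event_time, survival) steps."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    steps = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all subjects sharing the same observed time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1.0 - n_events / n_at_risk
            steps.append((t, survival))
        n_at_risk -= n_leaving
    return steps


def rmst(times, events, tau):
    """Area under the Kaplan-Meier step function on [0, tau]."""
    area = 0.0
    prev_time, prev_survival = 0.0, 1.0
    for t, s in kaplan_meier(times, events):
        if t >= tau:
            break
        area += prev_survival * (t - prev_time)
        prev_time, prev_survival = t, s
    area += prev_survival * (tau - prev_time)
    return area


def rmst_difference(treated, control, tau):
    """RMST(treated) - RMST(control); each arm is a (times, events) pair."""
    return rmst(*treated, tau) - rmst(*control, tau)
```

Here events are coded 1 for an observed event and 0 for a censored observation. With no censoring, rmst reduces to the sample mean of min(T, tau), which is a convenient check.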
The integration of time-to-intermediate event data and the evolving characteristics of patients to enhance long-term prediction has garnered significant interest, driven by the wealth of data generated from longitudinal cohorts. In this talk, we propose sequential/dynamic prediction rules by using regression models with time-varying coefficients. We introduce a class of dynamic models that not only incorporates intermediate event information but also leverages information across different landmark times. To address the challenge of right-censoring, we employ an inverse weighting technique in the estimation process. We establish the asymptotic properties of the estimated parameters and conduct extensive simulations to assess the finite sample performance. We apply the proposed method to real-world data from the Atherosclerosis Risk in Communities (ARIC) study and predict mortality while incorporating information regarding a crucial intermediate event, the occurrence of a stroke, and other time-varying covariates dynamically.
Keywords
Dynamic prediction
Intermediate event
Landmark time
Long-term prediction
Time-varying effect
Co-Author(s)
Wen Li, University of Texas Health Science Center at Houston McGovern Medical School
Ruosha Li, University of Texas School of Public Health
Jing Ning, University of Texas, MD Anderson Cancer Center
Speaker
Yunyi Wang
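The abstract above mentions an inverse weighting technique for handling right-censoring. One common form is inverse probability of censoring weighting (IPCW), sketched below for the simple task of estimating the probability of an event by time t. This is an assumption about the general flavor of the technique, not the authors' estimating equations: it has no covariates, landmark times, or time-varying coefficients, it assumes censoring is independent of the event time, and it assumes no event and censoring times are tied.

```python
def censoring_survival_before(times, events, t):
    """Kaplan-Meier estimate of G(t-) = P(C >= t), treating censoring as the event."""
    survival = 1.0
    for u in sorted(set(times)):
        if u >= t:
            break
        at_risk = sum(1 for x in times if x >= u)
        n_censored = sum(1 for x, d in zip(times, events) if x == u and d == 0)
        survival *= 1.0 - n_censored / at_risk
    return survival


def ipcw_event_probability(times, events, t):
    """IPCW estimate of P(T <= t).

    Each observed event is up-weighted by 1 / G(T_i-) to stand in for
    comparable subjects who were censored before their event could be seen.
    """
    total = 0.0
    for x, d in zip(times, events):
        if d == 1 and x <= t:
            total += 1.0 / censoring_survival_before(times, events, x)
    return total / len(times)
```

In this covariate-free setting the IPCW estimate coincides with one minus the Kaplan-Meier estimate, so the weights can be checked against a standard survival curve before being plugged into a regression model.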
Recurrent episode data frequently arise in chronic disease studies when an event of interest occurs repeatedly and each occurrence lasts for a random period of time. Understanding the heterogeneity in recurrent episode lengths can help guide dynamic and customized disease management. However, methods tailored to this end have received relatively little attention. Existing approaches either do not confer direct interpretation on episode lengths or involve restrictive or unrealistic distributional assumptions, such as exchangeability of within-individual episode lengths. In this work, we propose a modeling strategy that overcomes these limitations by adopting quantile regression and sensibly incorporating time-dependent covariates. Viewing recurrent episodes as clustered data, we develop an estimation procedure that properly handles the special complications, including dependent censoring, dependent truncation, and informative cluster size. Our estimation procedure is computationally simple and yields estimators with desirable asymptotic properties. Our numerical studies demonstrate the advantages of the proposed method over naive adaptations of existing approaches.
Keywords
Recurrent episode data
Alternating recurrent event process
Quantile regression
Informative cluster size
Dependent truncation
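Quantile regression, which the abstract above builds on, rests on the check (pinball) loss: the value that minimizes the summed check loss at level tau is a tau-th sample quantile. The sketch below demonstrates this for fully observed episode lengths with no covariates. It is only an illustration of the loss function; it does not handle censoring, truncation, or informative cluster size, and the function names are hypothetical.

```python
def check_loss(residual, tau):
    """Quantile check loss: tau * r if r >= 0, (tau - 1) * r if r < 0."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def quantile_by_check_loss(values, tau):
    """Return the observed value minimizing the total check loss at level tau.

    A minimizer of this piecewise-linear objective always exists among the
    observed values, so searching over them is sufficient.
    """
    def total_loss(q):
        return sum(check_loss(v - q, tau) for v in values)

    return min(sorted(values), key=total_loss)


if __name__ == "__main__":
    episode_lengths = [1, 2, 3, 4, 100]
    # The median is unaffected by the one very long episode.
    print(quantile_by_check_loss(episode_lengths, 0.5))
```

Modeling quantiles rather than means is what gives the direct interpretation on episode lengths that the abstract emphasizes, and it keeps a few very long episodes from dominating the fit.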