Lifetime Data Science (LiDS) Section Student Paper Awards

Pamela Shaw, Chair
Kaiser Permanente Washington Health Research Institute
 
Pamela Shaw, Organizer
Kaiser Permanente Washington Health Research Institute
 
Tuesday, Aug 5: 8:30 AM - 10:20 AM
0709 
Topic-Contributed Paper Session 
Music City Center 
Room: CC-105B 

In this session, the five student paper award winners from the Lifetime Data Science Section Student Paper Competition will present their award-winning work. These papers collectively cover several exciting novel developments in the field of survival analysis.

Keywords

Lifetime analysis

quantile regression

multistate models

prediction

recurrent events

clinical trials 

Applied

Yes

Main Sponsor

Lifetime Data Science Section

Presentations

Assessing treatment efficacy for interval-censored endpoints using multistate semi-Markov models fit to multiple data streams

We introduce a computationally efficient and general approach for utilizing multiple, possibly interval-censored, data streams to study complex biomedical endpoints using multistate semi-Markov models. Our motivating application is the REGEN-2069 trial, which investigated the protective efficacy (PE) of the monoclonal antibody combination REGEN-COV against SARS-CoV-2 when administered prophylactically to individuals in households at high risk of secondary transmission. Using data on symptom onset, episodic RT-qPCR sampling, and serological testing, we estimate the PE of REGEN-COV for asymptomatic infection, its effect on seroconversion following infection, and the duration of viral shedding. We find that REGEN-COV reduced the risk of asymptomatic infection and the duration of viral shedding, and led to lower rates of seroconversion among asymptomatically infected participants. To fit semi-Markov models to interval-censored data, we use a Monte Carlo expectation-maximization (MCEM) algorithm combined with importance sampling, which efficiently addresses the intractability of the marginal likelihood when data are intermittently observed. Our algorithm provides substantial computational improvements over existing methods and allows us to fit semi-parametric models despite complex coarsening of the data.

Keywords

Data Assimilation

Interval Censoring

Monte Carlo Expectation-Maximization

Semi-Markov Multistate Models

Panel Data

Splines 

Speaker

Raphael Morsomme, Food and Drug Administration

Design and Analysis of Clinical Trials with Survival Outcome by Incorporating Pre-Randomization Longitudinal Biomarkers

In contemporary cancer research, there is an increasing need to incorporate longitudinal biomarkers into randomized clinical trials to develop personalized treatment strategies that adapt to biomarkers measured prior to treatment. However, existing statistical methods for sample size/power calculations in clinical trials with survival outcomes often overlook the integration of longitudinal biomarker data. This article presents a sample size/power calculation formula with a robust inference method for estimating treatment effects without relying on distributional assumptions regarding random effects from longitudinal biomarkers. The proposed formula requires only quantities that are easily obtained from existing literature or pilot studies, and it imposes no distributional constraints on survival or censoring times. Extensive simulation studies demonstrate that the proposed inference method and sample size/power calculation formula exhibit strong finite-sample performance. The practical application of this method is illustrated through the design of a lung cancer prevention trial, utilizing data from the National Lung Screening Trial (NLST).

Keywords

Clinical trial design

Longitudinal-survival joint modeling

Personalized Treatment Strategies

Power Calculation 

Speaker

Haolin Li, University of North Carolina at Chapel Hill

Doubly Protected Estimation for Survival Outcomes Utilizing External Controls for Randomized Clinical Trials

Censored survival data are common in clinical trials, but small control groups can pose challenges, particularly in rare diseases or where balanced randomization is impractical. Recent approaches leverage external controls from historical studies or real-world data to strengthen treatment evaluation for survival outcomes. However, using external controls directly may introduce biases due to data heterogeneity. We propose a doubly protected estimator for the treatment-specific restricted mean survival time difference that is more efficient than trial-only estimators and mitigates biases from external data. Our method adjusts for covariate shifts via doubly robust estimation and addresses outcome drift using the DR-Learner for selective borrowing. The approach incorporates machine learning to approximate survival curves and detect outcome drifts without strict parametric assumptions, borrowing only comparable external controls. Extensive simulation studies and a real-data application evaluating the efficacy of galcanezumab in mitigating migraine headaches illustrate the effectiveness of the proposed framework.
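For readers unfamiliar with the estimand, the restricted mean survival time (RMST) up to a horizon tau is the area under the survival curve from 0 to tau. The sketch below is a minimal trial-only illustration that integrates a Kaplan-Meier curve; it is not the doubly protected estimator described in the abstract, and the function names, the toy data, and the choice of tau are assumptions made for illustration.

```python
def kaplan_meier(times, events):
    """Return distinct event times and the Kaplan-Meier survival estimate just after each.

    times  : observed follow-up times
    events : 1 if the event was observed, 0 if right-censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    event_times, surv_probs = [], []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        # Group all subjects sharing the same observed time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / n_at_risk
            event_times.append(t)
            surv_probs.append(surv)
        n_at_risk -= removed
    return event_times, surv_probs


def rmst(times, events, tau):
    """Area under the Kaplan-Meier step function from 0 to tau."""
    event_times, surv_probs = kaplan_meier(times, events)
    area, prev_t, prev_s = 0.0, 0.0, 1.0
    for t, s in zip(event_times, surv_probs):
        if t >= tau:
            break
        area += prev_s * (t - prev_t)
        prev_t, prev_s = t, s
    # Final rectangle from the last event time before tau up to tau.
    area += prev_s * (tau - prev_t)
    return area


# Hypothetical toy data: RMST difference between two arms at tau = 4.
treated = rmst([2, 3, 5], [1, 0, 1], tau=4)
control = rmst([1, 2, 3, 4], [1, 1, 1, 1], tau=4)
rmst_difference = treated - control
```

An RMST difference is read as the average event-free time gained (or lost) over the first tau time units, which is why it is a common target when hazards may not be proportional.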

Keywords

Adaptive learning

Monotone coarsening

unmeasured confounding

data heterogeneity 

Speaker

Chenyin Gao, North Carolina State University

Dynamic long-term prediction with intermediate event information: a flexible model with bivariate time-varying coefficients

The integration of time-to-intermediate event data and the evolving characteristics of patients to enhance long-term prediction has garnered significant interest, driven by the wealth of data generated from longitudinal cohorts. In this talk, we propose sequential/dynamic prediction rules by using regression models with time-varying coefficients. We introduce a class of dynamic models that not only incorporates intermediate event information but also leverages information across different landmark times. To address the challenge of right-censoring, we employ an inverse weighting technique in the estimation process. We establish the asymptotic properties of the estimated parameters and conduct extensive simulations to assess the finite sample performance. We apply the proposed method to real-world data from the Atherosclerosis Risk in Communities (ARIC) study and predict mortality while incorporating information regarding a crucial intermediate event, the occurrence of a stroke, and other time-varying covariates dynamically. 
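The abstract does not spell out its inverse weighting technique. One common form is inverse probability of censoring weighting (IPCW), in which each observed event is up-weighted by the inverse of the estimated probability of remaining uncensored until just before that event. The sketch below assumes that form and estimates a simple event probability by a fixed horizon; it is not the authors' time-varying coefficient model, and all names and data are hypothetical.

```python
def censoring_survival_left(times, events):
    """Kaplan-Meier estimate of the censoring survival function G.

    Returns a function giving the left limit G(t-). Events tied with
    censorings at the same time are treated as occurring first.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    steps = []  # (censoring time, G just after that time)
    g = 1.0
    i = 0
    while i < len(data):
        t = data[i][0]
        censored = removed = 0
        while i < len(data) and data[i][0] == t:
            censored += 1 - data[i][1]
            removed += 1
            i += 1
        deaths = removed - censored
        if censored > 0:
            g *= 1.0 - censored / (n_at_risk - deaths)
            steps.append((t, g))
        n_at_risk -= removed

    def g_left(t):
        value = 1.0
        for s, g_s in steps:
            if s < t:
                value = g_s
            else:
                break
        return value

    return g_left


def ipcw_event_probability(times, events, horizon):
    """IPCW estimate of P(T <= horizon) under independent right-censoring."""
    g_left = censoring_survival_left(times, events)
    total = 0.0
    for x, d in zip(times, events):
        if d == 1 and x <= horizon:
            total += 1.0 / g_left(x)  # weight = 1 / G(x-)
    return total / len(times)


# Hypothetical toy data: one subject censored at time 3.
prob_by_4 = ipcw_event_probability([2, 3, 5], [1, 0, 1], horizon=4)
```

With no covariates, this weighted estimate coincides with one minus the Kaplan-Meier estimate; the value of the weighting idea is that the same weights can be carried into regression estimating equations.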

Keywords

Dynamic prediction

Intermediate event

Landmark time

Long-term prediction

Time-varying effect

Co-Author(s)

Wen Li, University of Texas Health Science Center at Houston McGovern Medical School
Ruosha Li, University of Texas School of Public Health
Jing Ning, University of Texas, MD Anderson Cancer Center

Speaker

Yunyi Wang

Exploring the Heterogeneity in Recurrent Episode Lengths Based On Quantile Regression

Recurrent episode data frequently arise in chronic disease studies when an event of interest occurs repeatedly and each occurrence lasts for a random period of time. Understanding the heterogeneity in recurrent episode lengths can help guide dynamic and customized disease management. However, methods tailored to this end have received relatively little attention. Existing approaches either do not confer direct interpretation on episode lengths or involve restrictive or unrealistic distributional assumptions, such as exchangeability of within-individual episode lengths. In this work, we propose a modeling strategy that overcomes these limitations by adopting quantile regression and sensibly incorporating time-dependent covariates. Viewing recurrent episodes as clustered data, we develop an estimation procedure that properly handles the special complications of dependent censoring, dependent truncation, and informative cluster size. Our estimation procedure is computationally simple and yields estimators with desirable asymptotic properties. Our numerical studies demonstrate the advantages of the proposed method over naive adaptations of existing approaches.

Keywords

Recurrent episode data

Alternating recurrent event process

Quantile regression

Informative cluster size

Dependent truncation 

Speaker

Yi Liu