Evaluating two-sample tests for differences in survival in the presence of long-term survivors
Tuesday, Aug 5: 11:20 AM - 11:35 AM
1927
Contributed Papers
Music City Center
Time-to-event data with long-term survivors (L-TS), subjects who never experience the event, occur in diverse fields (e.g., cancer, credit default risk, recidivism). Conventional two-sample tests (e.g., the log-rank test [LR]) ignore L-TS. Several alternatives exist, but they have not been comprehensively compared. We compared 7 methods via simulation: LR, three weighted log-rank tests (WLR), two adaptive tests (two-stage and Yang-Prentice [YP]), and a correctly specified parametric model. We assessed the impact of sample size and follow-up time on type I error and power across varying effect sizes. When one or both groups lack L-TS, LR, WLR, and YP typically have the highest power, but their ranking varies. When both groups have L-TS, these tests have non-monotonic power as a function of follow-up time, whereas parametric models have monotonically increasing power and the highest power at the longest follow-up time. These patterns are consistent across sample sizes. We attribute the non-monotonicity to a deviation from proportional hazards that differs with follow-up time. This affects study planning in the setting of L-TS, as naïve use of the conventional LR can have counterintuitive properties.
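The kind of simulation described above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: it assumes a mixture cure model with exponential event times among susceptible subjects and a common administrative censoring time, and it covers only the unweighted log-rank test. The function names (`simulate_cure_group`, `log_rank`, `rejection_rate`) and all parameter values are hypothetical choices made for the example.

```python
import math
import random


def simulate_cure_group(n, cure_frac, rate, follow_up, rng):
    """Draw (time, event) pairs from an assumed mixture cure model.

    A fraction `cure_frac` never has the event (long-term survivors);
    the rest have exponential event times with the given rate.
    Everyone is administratively censored at `follow_up`.
    """
    data = []
    for _ in range(n):
        if rng.random() < cure_frac:
            t = math.inf  # long-term survivor
        else:
            t = rng.expovariate(rate)
        data.append((min(t, follow_up), t <= follow_up))
    return data


def log_rank(group0, group1):
    """Unweighted two-sample log-rank test: (chi-square, p-value), 1 df."""
    pooled = sorted(
        [(t, e, 0) for t, e in group0] + [(t, e, 1) for t, e in group1]
    )
    at_risk = [len(group0), len(group1)]
    obs_minus_exp, var = 0.0, 0.0
    i = 0
    while i < len(pooled):
        t = pooled[i][0]
        deaths, leaving = [0, 0], [0, 0]
        # Collect everyone whose observed time equals t (handles ties).
        while i < len(pooled) and pooled[i][0] == t:
            _, event, g = pooled[i]
            leaving[g] += 1
            if event:
                deaths[g] += 1
            i += 1
        n = at_risk[0] + at_risk[1]
        d = deaths[0] + deaths[1]
        if d > 0:
            p1 = at_risk[1] / n
            obs_minus_exp += deaths[1] - d * p1
            if n > 1:
                # Hypergeometric variance of the group-1 event count.
                var += d * p1 * (1 - p1) * (n - d) / (n - 1)
        at_risk[0] -= leaving[0]
        at_risk[1] -= leaving[1]
    if var == 0:
        return 0.0, 1.0
    chi2 = obs_minus_exp ** 2 / var
    # Upper tail of a chi-square with 1 df equals erfc(sqrt(x / 2)).
    return chi2, math.erfc(math.sqrt(chi2 / 2))


def rejection_rate(n, cure0, cure1, rate0, rate1, follow_up,
                   reps=200, alpha=0.05, seed=1):
    """Monte Carlo estimate of how often the log-rank test rejects."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        g0 = simulate_cure_group(n, cure0, rate0, follow_up, rng)
        g1 = simulate_cure_group(n, cure1, rate1, follow_up, rng)
        _, p = log_rank(g0, g1)
        rejections += p < alpha
    return rejections / reps


if __name__ == "__main__":
    # Illustrative settings only: both groups contain long-term survivors.
    for follow_up in (1.0, 3.0, 10.0):
        power = rejection_rate(100, 0.3, 0.4, 1.0, 0.7, follow_up)
        print(f"follow-up {follow_up}: estimated power {power:.2f}")
```

Setting the two groups' parameters equal turns the same loop into a type I error estimate. Comparing the weighted, adaptive, and parametric alternatives would need additional test functions that are not shown here.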
Survival analysis
Long-term survivors
Log-rank test
Mixture cure model
Adaptive tests
Main Sponsor
Biometrics Section