Impact of unequal censoring on the comparison of survival curves

Dipenkumar Modi Co-Author
Karmanos Cancer Institute
 
Seongho Kim Co-Author
Wayne State University
 
Hyejeong Jang First Author
Wayne State University
 
Hyejeong Jang Presenting Author
Wayne State University
 
Wednesday, Aug 7: 9:35 AM - 9:40 AM
3327 
Contributed Speed 
Oregon Convention Center 

Description

This study investigates the impact of unequal censoring on comparisons of survival distributions. We evaluated the performance of five statistical tests: the log-rank (LR), Gehan-Breslow generalized Wilcoxon (GB), Tarone-Ware (TW), Peto-Peto (PP), and modified Peto-Peto (mPP) tests. Using 1,000 simulations comparing two survival curves, we assessed the size and power of each test under four censoring patterns (overall, early, middle, and late censoring), for a total of 16 combinations across the two groups. We also explored scenarios with sample sizes of 20 and 50 per group and censoring levels of 10% and 25%. Regardless of sample size, censoring percentage, and censoring pattern, the LR test demonstrated the highest power overall, while the GB test showed the lowest. For a sample size of 20 per group, only the early-overall censoring combination exceeded 80% power for the LR and TW tests. For a sample size of 50 per group, the early-overall, early-early, early-middle, and early-late censoring combinations appeared to have minimal impact on all five tests. We further investigated the effect of combining the five p-values into a single value. The p-values combined using the generalized Fisher method had higher power than the LR test; however, the type I error rates were not as well maintained as those of the LR test.
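
The five tests compared above are commonly presented as members of one weighted log-rank family that differ only in the weight given to each distinct event time. The following is a minimal Python sketch of that family, not the authors' simulation code. The function names `weighted_logrank` and `fisher_combine` are hypothetical, and the Peto-Peto weights follow one common textbook convention; software packages differ in such details. The combination step shown is the classical Fisher method, which assumes independent p-values. That assumption does not hold for five tests computed on the same data, so it is included only to illustrate the idea behind combining p-values; the generalized Fisher method named in the abstract adjusts for dependence and is not reproduced here.

```python
import math

def weighted_logrank(times, events, groups, weight="LR"):
    """Two-sample weighted log-rank test (sketch).

    times  : observed times
    events : 1 = event observed, 0 = censored
    groups : 0 or 1
    weight : "LR", "GB", "TW", "PP", or "mPP"
    Returns (chi-square statistic on 1 df, p-value).
    """
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    s_tilde = 1.0  # pooled survival estimate used by the Peto-Peto weights
    num = 0.0      # weighted sum of observed minus expected events in group 1
    var = 0.0      # hypergeometric variance of that sum
    for t in event_times:
        n = sum(1 for x in times if x >= t)  # at risk, both groups
        n1 = sum(1 for x, g in zip(times, groups) if x >= t and g == 1)
        d = sum(1 for x, e in zip(times, events) if x == t and e == 1)
        d1 = sum(1 for x, e, g in zip(times, events, groups)
                 if x == t and e == 1 and g == 1)
        s_tilde *= 1.0 - d / (n + 1)
        w = {
            "LR": 1.0,                      # log-rank: equal weights
            "GB": float(n),                 # Gehan-Breslow: number at risk
            "TW": math.sqrt(n),             # Tarone-Ware: sqrt of number at risk
            "PP": s_tilde,                  # Peto-Peto (assumed convention)
            "mPP": s_tilde * n / (n + 1),   # modified Peto-Peto (assumed convention)
        }[weight]
        num += w * (d1 - d * n1 / n)
        if n > 1:
            var += w * w * d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if var == 0.0:
        return 0.0, 1.0
    stat = num * num / var
    # Upper tail of a chi-square distribution with 1 degree of freedom
    return stat, math.erfc(math.sqrt(stat / 2.0))

def fisher_combine(pvalues):
    """Classical Fisher combination; valid only for independent p-values."""
    k = len(pvalues)
    half_x = -sum(math.log(p) for p in pvalues)  # half of -2 * sum(log p)
    # Closed-form upper tail of a chi-square distribution with 2k df
    return math.exp(-half_x) * sum(half_x ** i / math.factorial(i) for i in range(k))

# Hypothetical toy data: 1 = event, 0 = censored
times = [5, 8, 12, 3, 9, 15, 4, 6, 7, 10]
events = [1, 0, 1, 1, 1, 0, 1, 1, 0, 1]
groups = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
for name in ("LR", "GB", "TW", "PP", "mPP"):
    stat, p = weighted_logrank(times, events, groups, weight=name)
    print(f"{name}: chi-square = {stat:.3f}, p = {p:.3f}")
```

Because the GB, TW, PP, and mPP weights shrink as the risk set shrinks, those tests emphasize early differences, whereas the LR test weights all event times equally. This is one reason the tests can respond differently to where censoring occurs.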

Keywords

Log-rank test

Gehan-Breslow generalized Wilcoxon test

Tarone-Ware test

Peto-Peto test

Modified Peto-Peto test

Survival analysis 

Main Sponsor

Biometrics Section