Efficient and accurate framework for rare variant associations in biobank-scale time-to-event data

Xihao Li Co-Author
University of North Carolina at Chapel Hill
 
Hufeng Zhou Co-Author
Harvard University
 
Zilin Li Co-Author
Northeast Normal University
 
Xihong Lin Co-Author
Harvard T.H. Chan School of Public Health
 
Shuang Song First Author
Harvard T.H. Chan School of Public Health
 
Shuang Song Presenting Author
Harvard T.H. Chan School of Public Health
 
Sunday, Aug 3: 5:05 PM - 5:20 PM
2160 
Contributed Papers 
Music City Center 
Rare variants (RVs) play a key role in complex disease genetics. Advances in whole-genome and whole-exome sequencing (WGS/WES) have facilitated the identification of RV associations. However, RV analysis has limited statistical power because of low allele frequencies. These issues are compounded for time-to-event (TTE) phenotypes, where high censoring rates and population structure can violate the assumptions of standard association tests.
Here we present GATE-STAAR, an efficient and accurate framework for RV association tests of TTE phenotypes. We extend the burden test, SKAT, and ACAT to TTE phenotypes, and use saddlepoint and Gamma approximations to calibrate test statistics under extreme censoring. Functional annotations are integrated to improve statistical power. The method accounts for population structure while remaining computationally scalable to biobank-scale datasets.
Through extensive simulations, we demonstrate that GATE-STAAR substantially improves power while keeping type I error well controlled. Applying GATE-STAAR to the 500K UK Biobank (UKBB) WGS data, we identified novel signals with implications for disease onset and progression. The findings highlight the promise of RV analyses for advancing our understanding of disease etiology.
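
As an illustration of one building block named in the abstract, the sketch below shows the standard Cauchy combination rule that underlies ACAT, which merges several p-values into one. It is a generic textbook version and not the GATE-STAAR implementation or its TTE extension; the function name `acat_combine` and the equal default weights are assumptions made for this example.

```python
import math


def acat_combine(p_values, weights=None):
    """Combine p-values with the Cauchy combination rule (generic ACAT sketch).

    Hypothetical helper for illustration; not part of GATE-STAAR.
    """
    if weights is None:
        # Assumption: equal weights when none are supplied.
        weights = [1.0] * len(p_values)
    if len(weights) != len(p_values):
        raise ValueError("weights and p_values must have the same length")
    if any(not (0.0 < p < 1.0) for p in p_values):
        raise ValueError("p-values must lie strictly between 0 and 1")

    total_weight = sum(weights)
    # Map each p-value to a standard Cauchy variate; small p gives a large value.
    statistic = sum(
        w * math.tan((0.5 - p) * math.pi) for w, p in zip(weights, p_values)
    ) / total_weight
    # Upper-tail probability of the standard Cauchy distribution.
    return 0.5 - math.atan(statistic) / math.pi


if __name__ == "__main__":
    # One strong signal among weak ones still yields a small combined p-value.
    print(acat_combine([0.001, 0.5, 0.8]))
```

The combined p-value is dominated by the smallest inputs, which is why this kind of rule is often used to aggregate tests across variant sets or annotation weightings.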

Keywords

rare variants

time-to-event phenotypes

biobank studies

saddlepoint approximation

functional annotations

SKAT 

Main Sponsor

Section on Statistics in Genomics and Genetics