A tree-based scan statistic for database studies with time-to-event outcomes

Georg Hahn Co-Author
 
Shirley Wang Co-Author
 
Massimiliano Russo First Author
The Ohio State University
 
Massimiliano Russo Presenting Author
The Ohio State University
 
Thursday, Aug 7: 8:35 AM - 8:50 AM
2773 
Contributed Papers 
Music City Center 
Tree-based scan statistics (TBSSs) are machine learning methods for disproportionality analyses in database studies. They simultaneously scan thousands of hierarchically related outcomes to detect potential signals of harm from health products while controlling for multiplicity, and they have been used extensively in pharmacoepidemiology. Current TBSS implementations do not allow for comparative safety evaluation with time-to-event outcomes, which are available in most database studies. Explicitly accounting for person-time can improve the power to detect signals compared with methods that use only the number of events. We propose three novel TBSSs for time-to-event data. The first assumes proportional hazard rates (HRs) for each node and uses a permutation scheme for inference. The second builds on exponential survival models for the terminal nodes of the hierarchy, implying a constant HR for each node, and uses a parametric bootstrap for inference. The third uses robust asymptotic approximations of the HRs to build an approximate parametric bootstrap. We compare the proposed methods with standard TBSSs in various simulation scenarios and in a database study.
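
As a rough illustration of how person-time can enter a tree-based scan, the sketch below computes a one-sided exponential (constant-hazard) log-likelihood ratio at every node of a small tree, takes the maximum over nodes as the scan statistic, and calibrates it by permuting exposure labels. It is not the authors' implementation and does not reproduce any of the three proposed methods exactly: pairing an exponential likelihood with a permutation scheme is a simplification chosen for brevity. The data layout, the rule that a node's time is the earliest time among its leaves, and all function names are assumptions made for this example.

```python
import math
import random


def exp_llr(d1, t1, d0, t0):
    """One-sided log-likelihood ratio for a higher event rate in the exposed
    group (d1 events, t1 person-time) than in the comparator group (d0, t0),
    assuming exponential event times in both groups."""
    d, t = d1 + d0, t1 + t0
    if d == 0 or t1 <= 0 or t0 <= 0 or d1 / t1 <= d0 / t0:
        return 0.0  # no evidence of excess risk in the exposed group

    def xlogy(x, y):
        return x * math.log(y) if x > 0 else 0.0

    return xlogy(d1, d1 / t1) + xlogy(d0, d0 / t0) - xlogy(d, d / t)


def node_counts(records, leaves, labels):
    """Events and person-time per group at a node. Assumption: a person's
    time at a node is the earliest (time, event) pair among its leaves."""
    d, t = [0, 0], [0.0, 0.0]
    for rec, g in zip(records, labels):
        time, event = min((rec[leaf] for leaf in leaves),
                          key=lambda te: (te[0], -te[1]))
        d[g] += event
        t[g] += time
    return d[1], t[1], d[0], t[0]


def scan_statistic(records, tree, labels):
    """Maximum node-level log-likelihood ratio over the whole tree."""
    return max(exp_llr(*node_counts(records, leaves, labels))
               for leaves in tree.values())


def permutation_p_value(records, tree, labels, n_perm=999, seed=0):
    """Monte Carlo p-value from permuting exposure labels across persons.
    Using the maximum over nodes adjusts for scanning many outcomes."""
    rng = random.Random(seed)
    observed = scan_statistic(records, tree, labels)
    shuffled = list(labels)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if scan_statistic(records, tree, shuffled) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)
```

Here `tree` maps each node name to the list of leaf outcomes beneath it, each record maps a leaf name to a `(time, event)` pair, and `labels` holds 1 for exposed and 0 for comparator persons. Two groups with the same number of events but different person-time give a nonzero statistic, which is the information a count-only scan would miss.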

Keywords

Data mining

Epidemiology

Multiple testing

Scan statistics

Tree variable

Main Sponsor

Section on Statistics in Epidemiology