A computationally efficient TreeScan implementation via pruning

Presented During: Computationally Intensive Methods

Shirley Wang Co-Author

Georg Hahn First Author

Georg Hahn Presenting Author

Wednesday, Aug 6: 10:35 AM - 10:50 AM
1811
Contributed Papers

Music City Center

Tree-based scan statistics (TBSSs) are popular methods for disproportionality analysis, allowing one to search for potential increased risks of drugs and vaccines among thousands of hierarchically related outcomes. Using the Dvoretzky-Kiefer-Wolfowitz inequality, we compute statistically valid bounds on the p-values calculated by TBSS, and we use those bounds in two ways. First, we quickly estimate the number of signals in the data, using the fact that for non-significant nodes the p-value lower bound usually rises above the significance threshold early in the TBSS run. Second, we prune non-significant nodes, thereby reducing the size of the tree and speeding up the computation. Using a clinically relevant real data example (risk assessment of SGLT2 and GLP1 inhibitors via the hierarchical testing structure given by the International Classification of Diseases, ICD-10), we demonstrate that pruning considerably reduces the computational effort of TBSS while discovering the same signals.
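
The bounding idea described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that p-values are estimated by Monte Carlo, by comparing each node's observed statistic against null replicates of the maximum statistic over the tree, and that a node becomes a pruning candidate once its p-value lower bound exceeds the significance level. The function names (dkw_epsilon, p_value_bounds, prunable_nodes) and the parameters alpha and delta are hypothetical and chosen for illustration only.

```python
import math


def dkw_epsilon(n_replicates, delta):
    """Half-width eps of the two-sided DKW band.

    By the Dvoretzky-Kiefer-Wolfowitz inequality, the empirical CDF of
    n i.i.d. draws deviates from the true CDF by more than eps with
    probability at most 2 * exp(-2 * n * eps**2). Solving for eps at
    error level delta gives the expression below.
    """
    return math.sqrt(math.log(2.0 / delta) / (2.0 * n_replicates))


def p_value_bounds(observed_stat, null_stats, delta=0.01):
    """Lower and upper bounds on a Monte Carlo p-value (assumed setup).

    null_stats holds the null replicates generated so far; the plain
    empirical p-value is the share of replicates >= observed_stat.
    """
    n = len(null_stats)
    p_hat = sum(1 for s in null_stats if s >= observed_stat) / n
    eps = dkw_epsilon(n, delta)
    return max(0.0, p_hat - eps), min(1.0, p_hat + eps)


def prunable_nodes(observed, null_stats, alpha=0.05, delta=0.01):
    """Nodes whose p-value lower bound already exceeds alpha.

    Hypothetical pruning rule: such nodes cannot be significant at
    level alpha (up to the DKW error delta), so they could be dropped
    from later replicates.
    """
    return [
        node
        for node, stat in observed.items()
        if p_value_bounds(stat, null_stats, delta)[0] > alpha
    ]
```

For example, with 200 null replicates and delta = 0.01 the band half-width is about 0.115, so under this rule any node whose empirical p-value exceeds roughly 0.165 would already qualify for pruning at alpha = 0.05. Because the DKW band holds uniformly over the whole null distribution, the same half-width applies to all nodes simultaneously.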

Keywords

hypothesis testing

TreeScan

hierarchical outcomes

International Classification of Diseases

TBSS

risks of drugs

Main Sponsor

Section on Statistical Computing