Evaluation of Rare Variant Association Methods When Incorporating Data from Multiple Sources

Audrey Hendricks Co-Author
University of Colorado Denver
 
Jessica Murphy First Author
 
Jessica Murphy Presenting Author
 
Wednesday, Aug 6: 3:20 PM - 3:35 PM
1551 
Contributed Papers 
Music City Center 
Rare variant genetic associations are crucial to understanding complex traits and diseases. Yet the large sample sizes needed to observe rare variants can be difficult to obtain. Incorporating public summary data as external controls, meta-analyzing existing case-control studies, or combining different study types (e.g., case-only, control-only) can boost power by increasing sample size. However, using data from multiple sources can introduce bias due to differences in sample ascertainment and processing. Here, we compare the performance of rare variant association methods designed to incorporate external controls (iECAT-O and ProxECAT) with a new method (LogProx) that can leverage data from multiple sources. We also include SKAT-O, which was not designed for external data, as a baseline comparison. We find that SKAT-O often has optimal power, even without external controls, but ProxECAT and LogProx are the most powerful given a moderate ratio of cases to internal controls (e.g., ≥4:1). By identifying the scenarios (e.g., study designs, sample sizes) where the use of additional data sources is most beneficial, we hope to aid in the discovery of new genetic associations.

Keywords

statistical genetics

rare variant association methods

public summary data

external controls 

Main Sponsor

Section on Statistics in Genomics and Genetics