Calculating sample size for methylation sequencing studies

Jeanette Eckel-Passow Co-Author
Mayo Clinic
 
William Taylor Co-Author
Mayo Clinic
 
John Kisiel Co-Author
Mayo Clinic
 
Douglas Mahoney Co-Author
Mayo Clinic
 
Seth Slettedahl First Author
 
Seth Slettedahl Presenting Author
 
Monday, Aug 4: 8:50 AM - 8:55 AM
2361 
Contributed Speed 
Music City Center 
Background: DNA methylation regulates gene expression and can therefore be used for several applications, including detection of differentially methylated sites or regions. Little guidance is available for determining the sample size needed to adequately power a methylation sequencing study.
Methods: We used an overdispersed binomial model to calculate sample size and power. We performed an empirical review of sequencing studies conducted at our institution between 2011 and 2018 to calculate the overdispersion parameter and median read depth across 4 disease types, including normal tissues, for a total of 352 samples.
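The abstract does not state how the overdispersion parameter was estimated. As a minimal sketch, one common choice is the Pearson chi-square dispersion estimate for binomial counts at a single site: the Pearson statistic divided by its degrees of freedom. The function name, its arguments, and the toy counts below are hypothetical and are not taken from the study.

```python
def pearson_overdispersion(methylated, depth):
    """Pearson dispersion estimate for binomial counts at one site.

    methylated: methylated read counts, one per sample (hypothetical input)
    depth:      total read counts, one per sample (hypothetical input)
    Values near 1 are consistent with plain binomial variation;
    values above 1 indicate overdispersion.
    """
    if len(methylated) != len(depth) or len(depth) < 2:
        raise ValueError("need matching counts for at least two samples")
    # Pooled methylation proportion across samples.
    p_hat = sum(methylated) / sum(depth)
    if not 0.0 < p_hat < 1.0:
        raise ValueError("pooled proportion must lie strictly between 0 and 1")
    # Pearson chi-square: squared deviation over the binomial variance.
    chi2 = sum(
        (y - n * p_hat) ** 2 / (n * p_hat * (1.0 - p_hat))
        for y, n in zip(methylated, depth)
    )
    # Divide by degrees of freedom (samples minus one estimated proportion).
    return chi2 / (len(depth) - 1)


# Toy data, invented for illustration only: four samples at depth 30.
phi_hat = pearson_overdispersion([0, 2, 1, 5], [30, 30, 30, 30])
print(f"estimated overdispersion: {phi_hat:.2f}")
```

In practice such an estimate would be computed per site and summarized across sites and studies, for example by the median.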
Results: The median overdispersion parameter was 2.4 [IQR, 1.8-3.6], and the median read depth was 31 [26-34]. Assuming no overdispersion, 12 samples per group are required to detect a difference in methylation from 2% in controls to 6% in cases. With an overdispersion of 2.4, 29 samples per group are required to achieve 80% power.
Conclusion: The overdispersion parameter differed between tissues and platforms. These empirical results can help guide sample size and power calculations. We recommend that methylation studies account for inflated variances.
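To illustrate how an inflated variance feeds into sample size, here is a minimal sketch that treats each sample's methylation proportion as having variance phi * p * (1 - p) / depth and applies a two-sided, two-sample normal approximation. This is an assumed formulation, not necessarily the calculation used in the study; the function name, the default alpha of 0.05, and the choice to return an unrounded value are all assumptions.

```python
from statistics import NormalDist


def samples_per_group(p_control, p_case, depth, phi=1.0, alpha=0.05, power=0.80):
    """Approximate samples per group under an overdispersed binomial model.

    Assumes each sample's methylation proportion has variance
    phi * p * (1 - p) / depth, where phi = 1 is the plain binomial case.
    Returns the unrounded value; the rounding convention is left to the caller.
    """
    std_normal = NormalDist()
    # Quantiles for a two-sided test at level alpha and the target power.
    z_total = std_normal.inv_cdf(1.0 - alpha / 2.0) + std_normal.inv_cdf(power)
    # Sum of the per-sample variances in the two groups, inflated by phi.
    var_sum = phi * (
        p_control * (1.0 - p_control) + p_case * (1.0 - p_case)
    ) / depth
    return z_total ** 2 * var_sum / (p_case - p_control) ** 2


# Inputs taken from the abstract: 2% vs 6% methylation, median depth 31.
n_binomial = samples_per_group(0.02, 0.06, depth=31)
n_overdispersed = samples_per_group(0.02, 0.06, depth=31, phi=2.4)
print(f"no overdispersion: {n_binomial:.1f} per group")
print(f"phi = 2.4:         {n_overdispersed:.1f} per group")
```

Under these assumptions the sketch gives roughly 12.0 samples per group with no overdispersion and roughly 28.9 with an overdispersion of 2.4, in line with the 12 and 29 reported above. In this formulation the required sample size scales linearly with the overdispersion parameter.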

Keywords

Methylation Sequencing

Sample Size Calculation

Binomial Overdispersion 

Main Sponsor

Section on Statistics in Genomics and Genetics