Allele-frequency estimation and ancestry informative marker identification via retrospective regression

Lin Zhang (Speaker)
Simon Fraser University
 
Sunday, Aug 3: 4:30 PM - 4:55 PM
Invited Paper Session 
Music City Center 
Allele frequency estimation at a genetic marker plays a pivotal role in genetic studies. Its accuracy affects the accuracy and power of a genome-wide association study (GWAS). Moreover, allele frequencies may differ between seemingly similar populations, which makes allele frequency estimation particularly important for identifying ancestry informative markers (AIMs). Yet existing allele frequency estimation methods mostly rely on independent samples from a homogeneous population and cannot provide closed-form solutions for the maximum likelihood estimator (MLE) of the allele frequencies. To address these challenges, we propose a retrospective regression framework that takes genotype as the response variable and population and other covariates as the independent variables. The regression nature of our proposed method enables it to estimate allele frequencies in heterogeneous populations and to accommodate sample correlation. We support our analytical findings using genotype data from the 1000 Genomes Project for five super-populations.
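The idea of treating genotype as the response can be illustrated with a minimal sketch. It is not the authors' estimator: it assumes unrelated individuals, no covariates other than a population label, and ordinary least squares on genotype dosage divided by two. The function name `allele_freq_by_regression` and the toy data are hypothetical. With a one-hot population design and no intercept, the normal equations decouple, so each coefficient is the per-population mean of dosage/2, which is the usual sample allele frequency.

```python
from collections import defaultdict


def allele_freq_by_regression(genotypes, populations):
    """Least-squares fit of dosage/2 on one-hot population indicators.

    genotypes   : iterable of 0/1/2 counts of the reference allele
    populations : iterable of population labels, same length
    Returns a dict {label: estimated allele frequency}.

    Assumption (illustrative only): independent samples and no other
    covariates. For this design X'X is diagonal (group sizes) and X'y
    holds the group sums, so the OLS solution is sum / count per group.
    """
    genotypes = list(genotypes)
    populations = list(populations)
    if len(genotypes) != len(populations):
        raise ValueError("genotypes and populations must have equal length")

    xtx = defaultdict(int)    # diagonal of X'X: samples per population
    xty = defaultdict(float)  # X'y: summed response per population
    for g, pop in zip(genotypes, populations):
        if g not in (0, 1, 2):
            raise ValueError(f"genotype must be 0, 1 or 2, got {g!r}")
        xtx[pop] += 1
        xty[pop] += g / 2.0   # response is dosage scaled to [0, 1]

    # Solve the decoupled normal equations, one coefficient per population.
    return {pop: xty[pop] / xtx[pop] for pop in xtx}


# Toy data for two hypothetical populations, "A" and "B".
genotypes = [0, 1, 2, 1, 0, 0, 1, 0]
populations = ["A", "A", "A", "A", "B", "B", "B", "B"]
freqs = allele_freq_by_regression(genotypes, populations)
```

A marker whose fitted coefficients differ markedly between populations, as in this toy example, is the kind of marker one would consider as a candidate AIM. Handling additional covariates or correlated samples would require a full regression fit with an appropriate covariance structure, which this sketch does not attempt.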