Penalized linear mixed models for correlated genetic data

Patrick Breheny Co-Author
University of Iowa
 
Tabitha Peter First Author
University of Iowa
 
Tabitha Peter Presenting Author
University of Iowa
 
Monday, Aug 4: 9:35 AM - 9:50 AM
2086 
Contributed Papers 
Music City Center 
Background

As genome-wide association studies (GWAS) aim to represent diverse populations and examine the heritability of complex traits, they produce genetic data with multiple layers of correlation, e.g., family groups nested within different data collection sites. This correlation structure motivates our application of high-dimensional regression. We propose a methodology and a new R package for fitting penalized linear mixed models to correlated genetic data.

Methods

We introduce a novel projection technique to decorrelate structured genetic data. Our approach addresses practical model-building challenges, including cross-validation. The methodology is implemented in our R/C++ package, plmmr, which fits the regression model without reading the data into memory and can therefore scale to GWAS-sized analyses.
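
The abstract does not spell out the projection itself. As a generic illustration of what decorrelating data before a penalized fit can look like, the sketch below whitens a response and a design column with the Cholesky factor of a covariance matrix that is assumed to be known. This is a standard textbook device, not the plmmr algorithm or its API; the toy covariance, the values of y and x, and all function names are hypothetical.

```python
import math

def cholesky(sigma):
    """Lower-triangular L with L L^T = sigma (sigma symmetric positive definite)."""
    n = len(sigma)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(sigma[i][i] - s)
            else:
                L[i][j] = (sigma[i][j] - s) / L[j][j]
    return L

def whiten(L, v):
    """Solve L z = v by forward substitution, i.e. z = L^{-1} v."""
    z = []
    for i, row in enumerate(L):
        z.append((v[i] - sum(row[k] * z[k] for k in range(i))) / row[i])
    return z

def whitened_covariance(L, sigma):
    """Compute L^{-1} sigma L^{-T}; the identity if whitening removed all correlation."""
    n = len(sigma)
    # Columns of M = L^{-1} sigma.
    cols = [whiten(L, [sigma[i][j] for i in range(n)]) for j in range(n)]
    # Column i of M^T is row i of M; applying L^{-1} again gives L^{-1} sigma L^{-T}.
    out_cols = [whiten(L, [cols[j][i] for j in range(n)]) for i in range(n)]
    return [[out_cols[j][i] for j in range(n)] for i in range(n)]

# Hypothetical toy covariance: two families of two relatives each,
# within-family correlation 0.5, no correlation between families.
sigma = [
    [1.0, 0.5, 0.0, 0.0],
    [0.5, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.5],
    [0.0, 0.0, 0.5, 1.0],
]
L = cholesky(sigma)

# Hypothetical response and a single genotype column, one value per subject.
y = [1.0, 2.0, 0.5, 1.5]
x = [0.0, 1.0, 2.0, 1.0]

# The transformed data have uncorrelated errors under the assumed sigma,
# so an ordinary lasso solver could be applied to (x_tilde, y_tilde).
y_tilde = whiten(L, y)
x_tilde = whiten(L, x)
```

In practice the covariance would have to be estimated from the data, and any cross-validation would need to apply the same transformation consistently across folds; the sketch leaves both aside.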

Results

We demonstrate our method using data from a GWAS of orofacial clefts that involved family groups from multiple sites worldwide.

Discussion

We will explore how our approach may be used to create polygenic risk scores.

Keywords

Statistical genetics

GWAS

High-dimensional regression

lasso

Statistical computing 

Main Sponsor

Section on Statistics in Genomics and Genetics