Bayesian Variable Selection for Ultra High-Dimensional Semiparametric Additive Partial Linear Models

Somak Dutta Co-Author
Iowa State University
 
Vivekananda Roy Co-Author
Iowa State University
 
Debarshi Chakraborty First Author
Iowa State University
 
Debarshi Chakraborty Presenting Author
Iowa State University
 
Monday, Aug 4: 2:35 PM - 2:50 PM
1813 
Contributed Papers 
Music City Center 
Semiparametric regression models containing linear and nonlinear additive components generalize multiple linear regression models. We prefer them to fully nonparametric models when some covariates have linear effects. While variable selection for multiple linear regression has been widely studied, work on additive partial linear models (APLMs) is more recent. We develop a Bayesian group selection method for APLMs using splines to approximate the nonlinear functions. Our work is based on a hierarchical model with priors on regression coefficients, spline coefficients, and the model space. We prove model selection consistency even when the number of predictors grows nearly exponentially with sample size. We propose a scalable algorithm for exploring gigantic model spaces and efficiently detecting regions of high posterior probability. We use various simulation setups to evaluate our approach and compare its performance with other available methods. An analysis of data from a genome-wide association study, with 360 observations on a particular plant trait as the response and nearly a million SNPs and 30,000 gene expressions as predictors, demonstrates the scalability and performance of our approach.
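
As a rough sketch of the model class described above, an APLM with spline-approximated nonlinear terms can be written as follows. All notation here ($y_i$, $x_{ij}$, $z_{ik}$, $\beta_j$, $f_k$, $B_{km}$, $\theta_{km}$, $M$) is assumed for illustration and is not taken from the abstract; the authors' exact basis, priors, and error assumptions may differ.

```latex
\begin{align*}
  y_i &= \beta_0 + \sum_{j=1}^{p} x_{ij}\,\beta_j
         + \sum_{k=1}^{q} f_k(z_{ik}) + \varepsilon_i,
         \qquad i = 1, \dots, n, \\
  f_k(z) &\approx \sum_{m=1}^{M} \theta_{km}\, B_{km}(z),
         \qquad k = 1, \dots, q.
\end{align*}
```

Under this representation, including or excluding a nonlinear component $f_k$ amounts to including or excluding the whole coefficient group $(\theta_{k1}, \dots, \theta_{kM})$, while each linear covariate contributes a single coefficient $\beta_j$. This is one way to see why variable selection in an APLM is naturally a group selection problem.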

Keywords

Genome-wide association study

Hierarchical Model

Group selection

Stochastic Search

Additive Partial Linear Model

Posterior Prediction 

Main Sponsor

Section on Bayesian Statistical Science