Tuesday, Aug 6: 10:30 AM - 12:20 PM
3154
Contributed Posters
Oregon Convention Center
We consider the variable selection problem for linear models in the M-open setting, where the data-generating process lies outside the model space. We focus on the novel problem of Model Superinduction: the tendency of model selection procedures to exponentially favor larger models as the sample size grows, resulting in overparameterized models that induce severe computational difficulties. We prove the existence of this phenomenon for popular classes of model selection priors, such as mixtures of g-priors and the family of spike-and-slab priors. We further show that this behavior is inescapable for any KL-divergence-minimizing model selection procedure, so we instead seek to mitigate its effects for large n while preserving posterior consistency. We propose variants of the aforementioned priors whose influence on the posterior diminishes slowly with n, favoring simpler models while preserving consistency. We further propose a model space prior that penalizes model complexity more strongly at large sample sizes. We demonstrate the efficacy of our proposed solutions via synthetic data examples and a case study using albedo data from GOES satellites.
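The exponential growth of evidence in favor of one model over another can be seen directly in the closed-form Bayes factor under Zellner's g-prior, where the log Bayes factor of a model against the intercept-only null scales linearly in n for fixed R². The sketch below is not the authors' code; the function name and the default unit-information choice g = n are our assumptions, and the closed form is the standard one from the g-prior literature:

```python
import numpy as np

def g_prior_log_bayes_factor(y, X, g=None):
    """Log Bayes factor of the linear model with design X (plus intercept)
    against the intercept-only null, under Zellner's g-prior.

    Closed form:
        log BF = ((n - 1 - p) / 2) * log(1 + g)
                 - ((n - 1) / 2) * log(1 + g * (1 - R^2))

    Defaults to the unit-information choice g = n (an assumption here).
    """
    n, p = X.shape
    if g is None:
        g = n  # unit-information g-prior
    yc = y - y.mean()           # center response (intercept handled implicitly)
    Xc = X - X.mean(axis=0)     # center predictors
    beta, *_ = np.linalg.lstsq(Xc, yc, rcond=None)
    resid = yc - Xc @ beta
    r2 = 1.0 - (resid @ resid) / (yc @ yc)  # ordinary R^2
    return 0.5 * (n - 1 - p) * np.log1p(g) \
        - 0.5 * (n - 1) * np.log1p(g * (1.0 - r2))
```

Because both terms grow linearly in n, any fixed gap in fit between two nested models translates into a log Bayes factor that diverges with the sample size, which is the mechanism behind the superinduction effect described above.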
Model selection
Bayesian decision theory
M-open model comparison
Linear models
Spike-and-slab prior
g-prior
Main Sponsor
Section on Bayesian Statistical Science