Tree-guided equi-sparsity pursuit for high-dimensional regression and classification

Jinwen Fu, First Author & Presenting Author
 
Aaron Molstad, Co-Author
University of Minnesota
 
Hui Zou, Co-Author
University of Minnesota
 
Wednesday, Aug 6: 10:05 AM - 10:20 AM
2015 
Contributed Papers 
Music City Center 
In high-dimensional linear models, sparsity is often assumed in order to control variability and improve model performance. Equi-sparsity, in which predictors are assumed to aggregate into groups sharing the same effect, is an alternative parsimonious structure that may be better suited to many applications. Previous work has also shown that such structures benefit prediction in the presence of "rare features". This paper proposes a tree-guided penalty for simultaneous estimation and group aggregation. Unlike existing methods, our estimator avoids overparametrization and the unfair group selection that results from it. We provide a closed-form solution to the proximal operator, allowing efficient computation despite hierarchically overlapping groups. Novel techniques are developed to study the finite-sample error bound of this seminorm-induced penalty under least squares and binomial deviance losses. Compared with existing methods, the proposed approach is often more favorable in high-dimensional settings, as verified by extensive simulation studies. The method is further illustrated with an application to microbiome data, where we conduct post-selection inference on group effects.
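As a rough illustration of the kind of closed-form proximal operator such penalties admit, the sketch below shows block soft-thresholding, the standard prox for a group-wise Euclidean-norm penalty. This is a generic textbook example, not the paper's tree-guided operator; the function name and the penalty form are assumptions for illustration only.

```python
import numpy as np

def prox_group(v, lam):
    """Block soft-thresholding: the proximal operator of lam * ||.||_2.

    Illustrative only -- the tree-guided penalty in the paper uses a
    different (hierarchically structured) operator. When the group's
    norm falls below lam, the whole group is shrunk to zero, which is
    how group-wise penalties induce aggregation/selection.
    """
    norm = np.linalg.norm(v)
    if norm == 0.0:
        return np.zeros_like(v)
    return max(1.0 - lam / norm, 0.0) * v

v = np.array([0.3, -0.4])      # group with ||v||_2 = 0.5
zeroed = prox_group(v, 0.6)    # lam > ||v||_2: group eliminated
shrunk = prox_group(v, 0.25)   # lam < ||v||_2: group shrunk by half
```

Hierarchically overlapping groups generally require applying such prox steps in a tree-ordered fashion; the paper's contribution is that its operator still has a closed form in that setting.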

Keywords

feature aggregation

equi-sparsity

tree-guided regularization

high-dimensional linear models

post-selection inference

proximal operator 

Main Sponsor

Section on Statistical Learning and Data Science