Lasso rates hold even after randomly discarding most of the predictors
Nima Hejazi
Co-Author
Harvard T.H. Chan School of Public Health
Salvador Balkus
Presenting Author
Harvard T.H. Chan School of Public Health
Wednesday, Aug 6: 9:20 AM - 9:35 AM
1652
Contributed Papers
Music City Center
In statistical learning, algorithms that fit a lasso over a set of basis functions can achieve desirable theoretical properties. For example, the Highly Adaptive Lasso (HAL) estimator applies the lasso to a very high-dimensional indicator basis, attaining dimension-free rates of convergence over a large class of functions. However, the time complexity of such algorithms is often exponential in the number of features, making them too computationally intensive for most practical data analyses. In this work, we show that a lasso-based empirical risk minimizer over a growing set of basis functions retains its asymptotic rate even when fit on only a random, relatively small subset of the basis. Applying this idea, we propose RandomHAL: a fast approximation to the HAL estimator that retains its desirable properties yet can be fit on datasets with many more features. We evaluate the empirical performance of RandomHAL in simulation experiments.
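The abstract does not specify the sampling scheme, so the following is only a minimal sketch of the idea in Python: construct zero-order HAL indicator basis functions (tensor products of indicators with knots at observed covariate values), draw a random budget of them rather than enumerating all of them, and fit a cross-validated lasso on the resulting design matrix. The names `random_hal_design` and `hal_basis_column`, the basis budget `n_basis`, and the uniform sampling of interaction subsets and knots are all hypothetical choices for illustration, not the authors' implementation.

```python
import numpy as np
from sklearn.linear_model import LassoCV

def hal_basis_column(X, subset, knot):
    """Zero-order HAL indicator: prod_{j in subset} 1{X[:, j] >= knot[j]}."""
    return np.all(X[:, subset] >= knot[subset], axis=1).astype(float)

def random_hal_design(X, n_basis, rng):
    """Sample n_basis random HAL basis functions instead of the full basis.

    Each draw picks a random interaction subset of features and a knot at a
    randomly chosen observation (duplicates are possible; fine for a sketch).
    """
    n, d = X.shape
    cols, specs = [], []
    for _ in range(n_basis):
        k = rng.integers(1, d + 1)                    # interaction order
        subset = rng.choice(d, size=k, replace=False) # which features
        i = rng.integers(n)                           # knot at observation i
        cols.append(hal_basis_column(X, subset, X[i]))
        specs.append((tuple(subset), X[i, subset].copy()))
    return np.column_stack(cols), specs

rng = np.random.default_rng(0)
n, d = 500, 5
X = rng.uniform(size=(n, d))
y = np.sin(3 * X[:, 0]) + X[:, 1] * X[:, 2] + 0.1 * rng.standard_normal(n)

# Lasso over a random subsample of the basis, tuned by cross-validation.
H, specs = random_hal_design(X, n_basis=300, rng=rng)
fit = LassoCV(cv=5).fit(H, y)
```

For context on why subsampling helps: with knots at the observed data points, the full zero-order HAL basis has on the order of n(2^d - 1) columns, so fixing a much smaller random budget of columns is what makes the fit tractable as d grows.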
Keywords
lasso
function estimation
empirical risk minimization
statistical learning
high-dimensional statistics
Main Sponsor
Section on Statistical Learning and Data Science