Lasso rates hold even after randomly discarding most of the predictors
Nima Hejazi
Co-Author
Harvard T.H. Chan School of Public Health
Salvador Balkus
Presenting Author
Harvard T.H. Chan School of Public Health
Wednesday, Aug 6: 9:20 AM - 9:35 AM
1652
Contributed Papers
Music City Center
In statistical learning, algorithms that fit a lasso over a set of basis functions can achieve desirable theoretical properties. For example, the Highly Adaptive Lasso (HAL) estimator applies the lasso to a very high-dimensional indicator basis, attaining dimension-free rates of convergence over a large class of functions. However, the time complexity of such algorithms is often exponential in the number of features, making them too computationally intensive for most practical data analyses. In this work, we show that a lasso-based empirical risk minimizer over a growing set of basis functions retains its asymptotic rate even when fit on only a random, relatively small subset of the basis. Applying this idea, we propose RandomHAL: a fast approximation to the HAL estimator that retains its desirable properties yet can be fit on datasets with many more features. We evaluate the empirical performance of RandomHAL in simulation experiments.
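The abstract does not specify the sampling scheme, so the following is only a minimal sketch of the idea in Python: construct zero-order HAL indicator basis functions (tensor products of indicators with knots at observed covariate values), draw a random budget of them rather than enumerating all of them, and fit a cross-validated lasso on the resulting design matrix. The names `random_hal_design` and `hal_basis_column`, the basis budget `n_basis`, and the uniform sampling of interaction subsets and knots are all hypothetical choices for illustration, not the authors' implementation.

```python
import numpy as np
from sklearn.linear_model import LassoCV

def hal_basis_column(X, subset, knot):
    """Zero-order HAL indicator: prod_{j in subset} 1{X[:, j] >= knot[j]}."""
    return np.all(X[:, subset] >= knot[subset], axis=1).astype(float)

def random_hal_design(X, n_basis, rng):
    """Sample n_basis random HAL basis functions instead of the full basis.

    Each draw picks a random interaction subset of features and a knot at a
    randomly chosen observation (duplicates are possible; fine for a sketch).
    """
    n, d = X.shape
    cols, specs = [], []
    for _ in range(n_basis):
        k = rng.integers(1, d + 1)                    # interaction order
        subset = rng.choice(d, size=k, replace=False) # which features
        i = rng.integers(n)                           # knot at observation i
        cols.append(hal_basis_column(X, subset, X[i]))
        specs.append((tuple(subset), X[i, subset].copy()))
    return np.column_stack(cols), specs

rng = np.random.default_rng(0)
n, d = 500, 5
X = rng.uniform(size=(n, d))
y = np.sin(3 * X[:, 0]) + X[:, 1] * X[:, 2] + 0.1 * rng.standard_normal(n)

# Lasso over a random subsample of the basis, tuned by cross-validation.
H, specs = random_hal_design(X, n_basis=300, rng=rng)
fit = LassoCV(cv=5).fit(H, y)
```

For context on why subsampling helps: with knots at the observed data points, the full zero-order HAL basis has on the order of n(2^d - 1) columns, so fixing a much smaller random budget of columns is what makes the fit tractable as d grows.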
Keywords
lasso
function estimation
empirical risk minimization
statistical learning
high-dimensional statistics
Main Sponsor
Section on Statistical Learning and Data Science