Out-of-sample risk estimation in no time flat

Daniel LeJeune (First Author, Presenting Author), Stanford University

Parth Nobel (Co-Author), Stanford University

Emmanuel Candès (Co-Author), Stanford University
Wednesday, Aug 7: 11:20 AM - 11:35 AM
Room 2344, Oregon Convention Center
Contributed Papers
Abstract

Hyperparameter tuning is an essential part of statistical machine learning pipelines, and it becomes more computationally challenging as datasets grow large. Furthermore, the standard method of k-fold cross-validation is known to be inconsistent for high-dimensional problems. We propose instead an efficient implementation of approximate leave-one-out (ALO) risk estimation, providing consistent risk estimation in high dimensions at a fraction of the cost of k-fold cross-validation. We leverage randomized numerical linear algebra to reduce the computational task to a handful of quasi-semidefinite linear systems, equivalent to equality-constrained quadratic programs, for any convex non-smooth loss and linearly separable regularizer.
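
To illustrate the flavor of the method (this is not the authors' implementation, which handles general non-smooth losses and separable regularizers), below is a minimal Python sketch for the ridge-regression special case, where the ALO residual is the in-sample residual scaled by 1/(1 - h_i), with h_i the hat-matrix diagonal. The function name alo_ridge_risk, the probe count n_probes, and the Rademacher diagonal estimator are illustrative choices standing in for the paper's randomized linear algebra.

import numpy as np

def alo_ridge_risk(X, y, lam, n_probes=30, seed=0):
    """ALO squared-error risk estimate for ridge regression.

    The diagonal of the hat matrix H = X (X'X + lam*I)^{-1} X' is
    estimated with Rademacher probes (a randomized diagonal estimator),
    so only a handful of extra linear solves are needed beyond the fit.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape

    # One Cholesky factorization of the regularized Gram matrix,
    # reused for the fit and for every probe solve.
    L = np.linalg.cholesky(X.T @ X + lam * np.eye(p))

    def solve(B):
        return np.linalg.solve(L.T, np.linalg.solve(L, B))

    beta = solve(X.T @ y)   # ridge coefficients
    resid = y - X @ beta    # in-sample residuals

    # diag(H) ~ mean_k z_k * (H z_k) over Rademacher probes z_k.
    Z = rng.choice([-1.0, 1.0], size=(n, n_probes))
    h = np.mean(Z * (X @ solve(X.T @ Z)), axis=1)
    h = np.clip(h, 0.0, 1.0 - 1e-8)  # guard against Monte Carlo noise

    # For ridge, resid_i / (1 - h_i) is the exact leave-one-out residual
    # when h_i is exact; here h_i is its randomized approximation.
    return np.mean((resid / (1.0 - h)) ** 2)

if __name__ == "__main__":
    # Hypothetical usage: tune lambda by minimizing the estimated risk.
    rng = np.random.default_rng(1)
    X = rng.standard_normal((1000, 400))
    y = X @ rng.standard_normal(400) + rng.standard_normal(1000)
    risks = {lam: alo_ridge_risk(X, y, lam) for lam in (0.1, 1.0, 10.0, 100.0)}
    print(min(risks, key=risks.get), risks)

Each probe costs one additional triangular solve against the already-computed factorization, so the whole risk curve over a grid of regularization strengths is far cheaper than refitting k times per grid point as k-fold cross-validation would require.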

Keywords

Risk estimation, Cross-validation, High dimensions, Convex optimization, Randomized methods

Main Sponsor

Section on Statistical Learning and Data Science