Pretraining and the Lasso

Co-Author: Mert Pilanci, Stanford University
Co-Author: Balasubramanian Narasimhan, Stanford University
Co-Author: Julia Salzman, Stanford University
Co-Author: Jonathan Taylor, Stanford University
Co-Author: Robert Tibshirani, Stanford University
Speaker: Erin Craig
Sunday, Aug 3: 4:25 PM - 4:45 PM
Topic-Contributed Paper Session 
Music City Center 
Pretraining is a popular and powerful paradigm in machine learning for passing information from one dataset to another. For example, suppose we have a modest-sized dataset of images of cats and dogs, and we plan to fit a neural network to classify them. With pretraining, we start with a neural network trained on a large corpus of images, consisting of not just cats and dogs but hundreds of other image types. We then fix all network weights except the top layer(s), which perform the final classification, and fine-tune those on our dataset. This often results in dramatically better performance than that of a network trained solely on our smaller dataset.
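To make the freeze-and-fine-tune recipe concrete, here is a minimal sketch assuming PyTorch and a torchvision ResNet; the particular model, learning rate, and training loop are illustrative choices, not specifics of the talk.

```python
# Minimal sketch of pretraining / fine-tuning for image classification.
# The backbone, optimizer, and loop below are assumptions for illustration.
import torch
import torch.nn as nn
from torchvision import models

# Start from a network pretrained on a large image corpus (ImageNet here).
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze every pretrained weight so only the new head is updated.
for param in model.parameters():
    param.requires_grad = False

# Replace the top layer with a 2-class head (cats vs. dogs) and fine-tune it.
model.fc = nn.Linear(model.fc.in_features, 2)
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

def fine_tune(loader, epochs=3):
    """Train only the classification head on the small cats/dogs dataset."""
    model.train()
    for _ in range(epochs):
        for images, labels in loader:  # loader: our modest-sized labeled data
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
```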

In this talk, I will present a framework for pretraining the lasso, which allows us to enjoy the performance benefits of pretraining while retaining the interpretability and simplicity of sparse linear modeling. Suppose, for example, that we wish to predict cancer survival time using a dataset that spans multiple cancer types. With lasso pretraining, we first fit a lasso model to the entire dataset and then use it to guide the fitting of a specific model for each cancer type. Importantly, a hyperparameter determines how strongly the overall model influences the specific models. This process also reveals which features are predictive for most or all classes and which are predictive for only one or a few; the latter set will often be of most interest to the scientist.
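As a rough sketch of how the overall model might guide the group-specific fits, the following assumes a Gaussian-response lasso in scikit-learn and encodes the guidance through an offset plus reduced penalties on the features the overall model selected; the parameter alpha_mix plays the role of the influence hyperparameter described above. This encoding is an assumption for illustration; the exact formulation used in the talk may differ.

```python
# Illustrative sketch only: the offset-plus-penalty-factor encoding of
# "guidance" is an assumption, not necessarily the talk's formulation.
import numpy as np
from sklearn.linear_model import Lasso

def lasso_pretrain(X, y, groups, alpha_mix=0.5, lam=0.1):
    """Fit an overall lasso, then one guided lasso per group.

    alpha_mix in (0, 1] controls the overall model's influence:
    values near 0 lean heavily on the overall fit, while values near 1
    let each group model stand on its own.
    """
    # Step 1: fit the overall model on the entire dataset.
    overall = Lasso(alpha=lam).fit(X, y)

    per_group = {}
    for g in np.unique(groups):
        Xg, yg = X[groups == g], y[groups == g]

        # Part of the overall prediction is carried over as a fixed offset.
        offset = (1.0 - alpha_mix) * overall.predict(Xg)

        # Features the overall model selected are penalized less; per-feature
        # penalties are emulated in scikit-learn by rescaling columns
        # (a small floor avoids extreme scaling when alpha_mix is near 0).
        pf = np.where(overall.coef_ != 0, max(alpha_mix, 1e-3), 1.0)
        fit = Lasso(alpha=lam).fit(Xg / pf, yg - offset)

        # Map coefficients back to the original feature scale.
        per_group[g] = {"coef": fit.coef_ / pf, "intercept": fit.intercept_}
        # Prediction for a new x in group g:
        #   (1 - alpha_mix) * overall.predict(x) + x @ coef + intercept

    return overall, per_group
```

Under this sketch, comparing the nonzero coefficients of the overall model with those of each group model hints at which features are shared across classes and which are specific to one or a few groups.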

Lasso pretraining is a general framework with a wide variety of applications, including stratified models, multi-response models, and conditional average treatment effect estimation, and I will demonstrate its use with real-world biomedical examples.