Breiman Award Lectures
Wednesday, Aug 6: 10:30 AM - 12:20 PM
0160
Invited Paper Session
Music City Center
Room: CC-208A
Machine Learning
Artificial Intelligence
Trustworthiness
Applied: No
Main Sponsor
Section on Statistical Learning and Data Science
Presentations
I will discuss our recent work on stochastic optimization with equality constraints. We consider nonlinear optimization problems with a stochastic objective and deterministic equality/inequality constraints. I will describe the development of adaptive algorithms based on sequential quadratic programming and discuss their properties.
Joint work with Sen Na, Yuchen Fang, Michael Mahoney, and Mihai Anitescu.
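For orientation, here is a minimal sketch of one basic stochastic SQP iteration for an equality-constrained problem, not the speaker's algorithm: a noisy gradient estimate and the constraint Jacobian are combined in a KKT system to produce a step. The toy objective, linear constraint, identity Hessian approximation, and decaying step size are all illustrative assumptions.

import numpy as np

# Toy problem (illustrative only): minimize E[(x - xi)^T (x - xi)]
# subject to the linear equality constraint x1 + x2 = 1, where only
# noisy gradients of the objective are available.
rng = np.random.default_rng(0)
target = np.array([2.0, 1.0])           # unknown mean of xi

def stochastic_grad(x, batch=8):
    """Noisy minibatch estimate of the objective gradient at x."""
    xi = target + rng.normal(scale=1.0, size=(batch, 2))
    return 2.0 * (x - xi.mean(axis=0))

def constraint(x):
    """Equality constraint c(x) = x1 + x2 - 1 (want c(x) = 0)."""
    return np.array([x[0] + x[1] - 1.0])

J = np.array([[1.0, 1.0]])              # constraint Jacobian (constant here)
H = np.eye(2)                           # simple Hessian approximation
x = np.zeros(2)

for k in range(200):
    g = stochastic_grad(x)
    c = constraint(x)
    # SQP step: solve the KKT system  [H  J^T; J  0] [d; lam] = [-g; -c]
    K = np.block([[H, J.T], [J, np.zeros((1, 1))]])
    sol = np.linalg.solve(K, np.concatenate([-g, -c]))
    d = sol[:2]
    alpha = 1.0 / (1.0 + 0.05 * k)      # one simple decaying step-size choice
    x = x + alpha * d

print("final iterate:", x)              # approximately feasible: x1 + x2 ~= 1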
Modern high-performance predictive algorithms such as large neural networks and tree ensembles are black boxes with many parameters. For example, it is routine to see performant language models with billions of parameters, which poses challenges for model deployment (especially in resource-constrained settings such as edge devices) and necessitates compressing or simplifying these models. In business analytics pipelines, it is common to see large-scale gradient boosted trees with high predictive performance that are difficult to interpret due to their massive size. Can we compress these performant models, for example by removing weights, neurons, or layers in neural networks, or trees, rules, or depth in tree ensembles, while retaining the performance of the original model? Can we select a small collection of rules from a large tree ensemble so that the selection is stable? I will discuss how to formulate these tasks as instances of constrained discrete optimization problems and examine the computational and statistical aspects of the resulting estimators.
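As a concrete illustration of the kind of cardinality-constrained selection problem involved, the sketch below compresses a boosted tree ensemble by greedily keeping k trees and refitting their weights by least squares. This is a generic heuristic, not the speaker's formulation or algorithm; the dataset, ensemble settings, and greedy rule are assumptions chosen for illustration.

import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression

# Fit a large boosted ensemble, then keep only k of its trees.
X, y = make_regression(n_samples=500, n_features=20, noise=5.0, random_state=0)
gbt = GradientBoostingRegressor(n_estimators=200, max_depth=3,
                                random_state=0).fit(X, y)

# Per-tree predictions form the "dictionary" we select columns from.
P = np.column_stack([est[0].predict(X) for est in gbt.estimators_])

k = 10                                   # target number of trees after compression
selected, resid = [], y - y.mean()
for _ in range(k):
    # Pick the tree whose predictions best explain the current residual.
    scores = np.abs(P.T @ resid)
    scores[selected] = -np.inf
    selected.append(int(np.argmax(scores)))
    # Refit weights on the selected trees by least squares.
    lr = LinearRegression().fit(P[:, selected], y)
    resid = y - lr.predict(P[:, selected])

print(f"{k} of {gbt.n_estimators} trees kept; "
      f"train R^2 = {1 - np.var(resid) / np.var(y):.3f}")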
In this paper, we introduce "uniLasso", a novel statistical method for regression. This two-stage approach preserves the signs of the univariate coefficients and leverages their magnitudes. Both properties are attractive for the stability and interpretability of the model. Through comprehensive simulations and applications to real-world datasets, we demonstrate that uniLasso outperforms the lasso in various settings, particularly in terms of sparsity and model interpretability. We prove asymptotic support recovery and mean-squared-error consistency under a set of conditions different from the well-known irrepresentability conditions for the lasso. Extensions to generalized linear models (GLMs) and Cox regression are also discussed.
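A minimal sketch of how such a two-stage, sign-preserving procedure could be organized, under my own assumptions about the stages (stage 1: a separate univariate regression per feature; stage 2: a lasso with non-negative weights on the univariate fits, so each final coefficient keeps the sign of its univariate counterpart). The actual uniLasso estimator may differ in its details; the data and penalty level here are illustrative.

import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, LinearRegression

X, y = make_regression(n_samples=300, n_features=50, n_informative=5,
                       noise=10.0, random_state=0)

# Stage 1 (assumed): one simple univariate regression per feature.
uni = [LinearRegression().fit(X[:, [j]], y) for j in range(X.shape[1])]
F = np.column_stack([m.predict(X[:, [j]]) for j, m in enumerate(uni)])

# Stage 2 (assumed): lasso with non-negative weights on the univariate fits.
stage2 = Lasso(alpha=1.0, positive=True).fit(F, y)

# Final coefficient on feature j = (non-negative weight) * (univariate slope),
# so any non-zero coefficient keeps the sign of its univariate counterpart.
uni_slopes = np.array([m.coef_[0] for m in uni])
final_coefs = stage2.coef_ * uni_slopes
print("non-zero coefficients:", int(np.sum(final_coefs != 0)))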