Algorithms for Simplifying Large Black Box Models

Rahul Mazumder, Speaker
Massachusetts Institute of Technology
 
Wednesday, Aug 6: 11:05 AM - 11:35 AM
Invited Paper Session 
Music City Center 
Modern high-performance predictive algorithms such as large neural networks and tree ensembles are black boxes with many parameters. For example, it is routine to see performant language models with billions of parameters; this poses challenges in model deployment (especially in resource-constrained settings such as edge devices), necessitating compression or simplification of these models. In business analytics pipelines, it is common to see large-scale gradient boosted trees with high predictive performance that are difficult to interpret because of their sheer size. Can we compress these performant models, for example by removing weights, neurons, or layers in neural networks, or trees, rules, or depth in tree ensembles, while retaining the performance of the original model? Can we select a small, stable collection of rules from a large tree ensemble? I will discuss how to formulate these tasks as constrained discrete optimization problems, and examine the computational and statistical aspects of the resulting estimators.
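
To fix ideas, one standard instance of such a formulation (a sketch in illustrative notation; the talk's exact objectives may differ) poses layer-wise neural network pruning as cardinality-constrained least squares: given a layer's input activations X and its dense pre-trained weights \bar{w}, one seeks sparse weights w that preserve the layer's outputs,

\[
\min_{w \in \mathbb{R}^p} \; \| X w - X \bar{w} \|_2^2 \quad \text{subject to} \quad \| w \|_0 \le k,
\]

where the \ell_0 constraint caps the number of nonzero weights at a budget k. An analogous discrete formulation selects a subset of at most k trees or rules from an ensemble and re-fits their weights to approximate the full ensemble's predictions.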