Recent Advances in Active Learning and Bayesian Optimization

Chair: Annie Booth, NC State University

Organizer: Chih-Li Sung, Michigan State University
 
Monday, Aug 5: 8:30 AM - 10:20 AM
Session 1297: Invited Paper Session
Oregon Convention Center
Room: CC-F149

Applied: Yes

Main Sponsor

Section on Physical and Engineering Sciences

Co-Sponsors

Technometrics
Uncertainty Quantification in Complex Systems Interest Group

Presentations

Bayesian Optimization for High-Dimensional and Structured Problems

Bayesian optimization is a framework that leverages the ability of Gaussian processes to quantify uncertainty in order to efficiently solve black-box optimization problems. For many years, much work in this area has focused on relatively low-dimensional continuous optimization problems where the objective function is highly expensive to evaluate and limited to at most a few hundred evaluations. In this talk, I'll discuss the application of Bayesian optimization to radically different optimization problems over challenging inputs like molecules, proteins, and database query plans. In these settings, practitioners may have access to vast libraries of known results, and the objective functions are structured, discrete, and high-dimensional. By uniting recent work on deep generative modeling, scalable Gaussian processes, and high-dimensional black-box optimization, we are able to achieve up to a 20x performance improvement over the state of the art on several of the most popular benchmarks for molecule design, and 5x improvements in query execution time over the built-in PostgreSQL query optimizer.
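The basic loop this abstract builds on (before the structured, high-dimensional extensions the talk discusses) can be sketched in a few lines of numpy: fit a Gaussian process to the evaluations so far, score candidates with an acquisition function such as expected improvement, and evaluate the best-scoring candidate. Everything below (the toy objective `f`, the kernel length-scale, the candidate grid) is an illustrative assumption, not code from the talk.

```python
import numpy as np
from math import erf

def f(x):
    # Hypothetical 1-D black-box objective (illustrative only): maximize on [0, 1].
    return -(x - 0.65) ** 2

def rbf_kernel(a, b, ls=0.15):
    # Squared-exponential kernel between 1-D input arrays a and b.
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

def gp_posterior(X, y, Xs, noise=1e-6):
    # Standard zero-mean GP regression: posterior mean and sd at test points Xs.
    K = rbf_kernel(X, X) + noise * np.eye(len(X))
    Ks = rbf_kernel(X, Xs)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    mu = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = np.clip(1.0 - np.sum(v ** 2, axis=0), 1e-12, None)
    return mu, np.sqrt(var)

def expected_improvement(mu, sd, best):
    # EI for maximization: E[max(f - best, 0)] under the GP posterior.
    z = (mu - best) / sd
    Phi = 0.5 * (1 + np.vectorize(erf)(z / np.sqrt(2)))
    phi = np.exp(-0.5 * z ** 2) / np.sqrt(2 * np.pi)
    return (mu - best) * Phi + sd * phi

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, 3)              # small initial design
y = f(X)
grid = np.linspace(0, 1, 201)         # candidate inputs
for _ in range(10):                   # sequential BO loop
    mu, sd = gp_posterior(X, y, grid)
    x_next = grid[np.argmax(expected_improvement(mu, sd, y.max()))]
    X, y = np.append(X, x_next), np.append(y, f(x_next))

best_x = X[np.argmax(y)]
```

The structured settings in the talk replace the continuous grid with discrete objects (molecules, query plans), which is where the deep generative modeling and scalable GP machinery come in.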

Speaker

Jacob Gardner, University of Pennsylvania

Contour Location for Reliability in Airfoil Simulation Experiments using Deep Gaussian Processes

Bayesian deep Gaussian processes (DGPs) outperform ordinary GPs as surrogate models when dynamics are non-stationary, as is especially prevalent in aerospace simulations. Yet DGP surrogates have not been deployed for the canonical downstream task in that setting: reliability analysis through contour location (CL). Level sets separating passing from failing operating conditions are best learned through strategic sequential design. Two limitations of modern CL methodology hinder DGP integration in this setting. First, the derivative-based optimization underlying acquisition functions is thwarted by sampling-based Bayesian (i.e., MCMC) inference, which is essential for DGP posterior integration. Second, canonical acquisition criteria, such as entropy, are famously myopic, to the extent that optimizing them may even be undesirable. Here we tackle both of these limitations at once, proposing a hybrid criterion that explores along the Pareto front of entropy and (predictive) uncertainty, requiring evaluation only at strategically located "triangulation" candidates. We showcase DGP CL performance in benchmark exercises and on a real-world RAE-2822 transonic airfoil simulation.
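The Pareto-front idea can be illustrated with a minimal numpy sketch (our construction, not the authors' code): score each candidate by the entropy of its predicted pass/fail classification at the contour threshold, and keep candidates that no other candidate beats on both entropy and predictive uncertainty. The test values and threshold below are assumptions for illustration.

```python
import numpy as np
from math import erf

def hybrid_pareto(mu, sd, threshold):
    # Probability each candidate's response falls below the failure threshold,
    # under a Gaussian predictive distribution with mean mu and sd.
    z = (threshold - mu) / sd
    p = 0.5 * (1 + np.array([erf(v / np.sqrt(2)) for v in z]))
    p = np.clip(p, 1e-12, 1 - 1e-12)
    # Classification entropy: highest where the contour location is least certain.
    entropy = -p * np.log(p) - (1 - p) * np.log(1 - p)
    # A candidate is kept if no other candidate has strictly higher entropy
    # AND strictly higher predictive uncertainty (weak Pareto dominance).
    keep = []
    for i in range(len(mu)):
        dominated = np.any((entropy > entropy[i]) & (sd > sd[i]))
        if not dominated:
            keep.append(i)
    return np.array(keep)

# Three hypothetical candidates: near the contour with high sd, far from it,
# and near the contour with low sd.
mu = np.array([0.0, 2.0, 0.0])
sd = np.array([1.0, 0.8, 0.5])
keep = hybrid_pareto(mu, sd, threshold=0.0)
```

The candidate far from the contour (low entropy, middling uncertainty) is dominated and dropped; the two near-contour candidates survive on the front.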

Speaker

Robert Gramacy, Virginia Tech

Experimenting, Fast and Slow: Bayesian Optimization of Long-term Outcomes with Online Experiments

Online experiments in internet systems, also known as A/B tests, are used for a wide range of system optimization problems, such as optimizing recommender system ranking policies and adaptive streaming controllers. Decision-makers generally wish to optimize for the long-term treatment effects of system changes, which often requires running experiments for a long time, since short-term measurements can be misleading due to non-stationarity in treatment effects over time. Sequential experimentation strategies, which typically involve several iterations, can be prohibitively long in such cases. We describe a novel approach that combines fast experiments (e.g., biased experiments run for only a few hours or days) and/or offline proxies (e.g., off-policy evaluation) with long-running, slow experiments to perform sequential Bayesian optimization over large action spaces in a short amount of time.
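The fast/slow combination can be illustrated with a toy numpy sketch (our simplification of the general idea, not the system described in the talk): fast experiments give cheap but biased readings of each arm's long-term effect, a handful of slow experiments on "anchor" arms calibrate that bias, and the debiased fast readings then screen a much larger action space. All numbers and the linear bias model are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
true_effect = rng.normal(0, 1, 50)                        # unknown long-term effects of 50 arms
fast = 0.6 * true_effect + 0.3 + rng.normal(0, 0.05, 50)  # fast experiment: biased, low noise
anchors = [0, 1, 2, 3, 4]                                 # arms also run as slow experiments
slow = true_effect[anchors] + rng.normal(0, 0.02, 5)      # slow experiment: unbiased

# Fit a linear calibration slow ~ a * fast + b on the anchor arms,
# then debias every fast reading with it.
a, b = np.polyfit(fast[anchors], slow, 1)
debiased = a * fast + b
best = int(np.argmax(debiased))                           # arm to promote to a slow experiment
```

In a sequential loop, the top debiased arms would graduate to long-running experiments, whose results refresh the calibration for the next round.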

Speaker

Eytan Bakshy, Facebook

Uncertainty Quantification for Bayesian Optimization

Bayesian optimization is a class of global optimization techniques. In Bayesian optimization, the underlying objective function is modeled as a realization of a Gaussian process. Although the Gaussian process assumption implies a random distribution of the Bayesian optimization outputs, quantification of this uncertainty is rarely studied in the literature. In this work, we propose a novel approach to assess the output uncertainty of Bayesian optimization algorithms, which proceeds by constructing confidence regions of the maximum point (or value) of the objective function. These regions can be computed efficiently, and their confidence levels are guaranteed by the uniform error bounds for sequential Gaussian process regression newly developed in the present work. Our theory provides a unified uncertainty quantification framework for all existing sequential sampling policies and stopping criteria. 
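The shape of such a confidence region can be sketched with a simplified construction (ours, not the paper's exact bound): given a uniform error bound |f(x) - mu(x)| <= beta * sd(x), any x whose upper bound reaches the best lower bound could still be the maximizer, so those x form a confidence region for the maximum point. The posterior mean, sd, and beta below are illustrative assumptions.

```python
import numpy as np

def argmax_confidence_region(mu, sd, beta=2.0):
    # Under a uniform bound |f - mu| <= beta * sd, x can be the maximizer
    # only if its upper bound reaches the largest lower bound.
    lcb = mu - beta * sd
    ucb = mu + beta * sd
    return np.where(ucb >= lcb.max())[0]

grid = np.linspace(0, 1, 101)
mu = -(grid - 0.3) ** 2            # pretend GP posterior mean, peaked at 0.3
sd = np.full_like(grid, 0.01)      # pretend (small) posterior standard deviation
region = argmax_confidence_region(mu, sd)
```

As the posterior sd shrinks with more evaluations, the region contracts around the true maximizer, which is the behavior the guaranteed coverage levels make precise.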

Speaker

Rui Tuo, Texas A&M University