Monday, Aug 5: 8:30 AM - 10:20 AM
1297
Invited Paper Session
Oregon Convention Center
Room: CC-F149
Applied: Yes
Main Sponsor
Section on Physical and Engineering Sciences
Co Sponsors
Technometrics
Uncertainty Quantification in Complex Systems Interest Group
Presentations
Bayesian optimization is a framework that leverages the ability of Gaussian processes to quantify uncertainty in order to efficiently solve black-box optimization problems. For many years, much work in this area has focused on relatively low-dimensional continuous optimization problems where the objective function is highly expensive to evaluate and is limited to a few hundred evaluations at most. In this talk, I'll discuss the application of Bayesian optimization to radically different optimization problems over challenging inputs like molecules, proteins, and database query plans. In these settings, practitioners may have access to vast libraries of known results, and the objective functions are structured, discrete, and high-dimensional. By uniting recent work on deep generative modelling, scalable Gaussian processes, and high-dimensional black-box optimization, we are able to achieve up to a 20x performance improvement over the state of the art on several of the most popular benchmarks for molecule design, and 5x improvements in query execution time over the built-in PostgreSQL query optimizer.
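The basic Bayesian optimization loop the abstract builds on can be sketched generically: fit a Gaussian process to the evaluations so far, score candidates with an acquisition function such as expected improvement, and evaluate the best candidate. The kernel, toy objective, lengthscale, and candidate grid below are illustrative stand-ins, not the high-dimensional methods from the talk:

```python
import math
import numpy as np

def rbf(X1, X2, ls=0.15):
    # Squared-exponential kernel on 1-D inputs (unit prior variance).
    d = X1[:, None] - X2[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

def gp_posterior(X, y, Xs, noise=1e-6):
    # Exact GP regression: posterior mean and std. dev. at candidates Xs.
    K = rbf(X, X) + noise * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    Ks = rbf(X, Xs)
    mu = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    sd = np.sqrt(np.clip(1.0 - np.sum(v ** 2, axis=0), 1e-12, None))
    return mu, sd

def expected_improvement(mu, sd, best):
    # EI for maximization under a Gaussian posterior.
    z = (mu - best) / sd
    cdf = np.array([0.5 * (1 + math.erf(zi / math.sqrt(2))) for zi in z])
    pdf = np.exp(-0.5 * z ** 2) / math.sqrt(2 * math.pi)
    return (mu - best) * cdf + sd * pdf

f = lambda x: np.exp(-(x - 0.6) ** 2 / 0.05)   # toy objective, max at 0.6
grid = np.linspace(0, 1, 201)                  # discrete candidate set
X = np.array([0.1, 0.5, 0.9])                  # initial design
y = f(X)
for _ in range(10):                            # sequential BO loop
    mu, sd = gp_posterior(X, y, grid)
    x_next = grid[np.argmax(expected_improvement(mu, sd, y.max()))]
    X, y = np.append(X, x_next), np.append(y, f(x_next))

best_x = X[np.argmax(y)]
```

In the structured settings of the talk, the continuous grid would be replaced by discrete objects (molecules, query plans) embedded via a generative model, but the surrogate-plus-acquisition loop is the same.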
Bayesian deep Gaussian processes (DGPs) outperform ordinary GPs as surrogate models when dynamics are non-stationary, which is especially prevalent in aerospace simulations. Yet DGP surrogates have not been deployed for the canonical downstream task in that setting: reliability analysis through contour location (CL). Level sets separating passable vs. failable operating conditions are best learned through strategic sequential design. There are two limitations of modern CL methodology which hinder DGP integration in this setting. First, derivative-based optimization underlying acquisition functions is thwarted by sampling-based Bayesian (i.e., MCMC) inference, which is essential for DGP posterior integration. Second, canonical acquisition criteria, such as entropy, are famously myopic to the extent that optimization may even be undesirable. Here we tackle both of these limitations at once, proposing a hybrid criterion that explores along the Pareto front of entropy and (predictive) uncertainty, requiring evaluation only at strategically located "triangulation" candidates. We showcase DGP CL performance in benchmark exercises and on a real-world RAE-2822 transonic airfoil simulation.
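The Pareto-front idea can be sketched in miniature: score each candidate by (a) the entropy of its posterior pass/fail classification at the level set and (b) its predictive uncertainty, then keep only non-dominated candidates. The posterior means, standard deviations, and level below are hypothetical numbers for illustration; the talk's "triangulation" candidate construction and DGP posterior are not reproduced here:

```python
import math
import numpy as np

def classification_entropy(mu, sd, level):
    # Entropy of the posterior probability that the response exceeds
    # the level set, assuming a Gaussian predictive distribution.
    z = (mu - level) / sd
    p = np.array([0.5 * (1 + math.erf(zi / math.sqrt(2))) for zi in z])
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -(p * np.log(p) + (1 - p) * np.log(1 - p))

def pareto_front(entropy, sd):
    # Indices not dominated in (entropy, sd): no other candidate is at
    # least as good in both objectives and strictly better in one.
    n = len(entropy)
    front = []
    for i in range(n):
        dominated = any(
            entropy[j] >= entropy[i] and sd[j] >= sd[i]
            and (entropy[j] > entropy[i] or sd[j] > sd[i])
            for j in range(n) if j != i
        )
        if not dominated:
            front.append(i)
    return front

# Hypothetical posterior summaries at three candidate inputs.
level = 0.0
mu = np.array([0.0, 2.0, 0.1])   # candidate 0 sits right on the contour
sd = np.array([0.5, 1.0, 0.4])   # candidate 1 is far away but uncertain
H = classification_entropy(mu, sd, level)
front = pareto_front(H, sd)
```

Here candidate 0 (maximal entropy) and candidate 1 (maximal uncertainty) trade off against each other and both survive, while candidate 2 is dominated; a sequential design would then acquire a point from this front, sidestepping derivative-based optimization of a single criterion.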
Online experiments in internet systems, also known as A/B tests, are used for a wide range of system optimization problems, such as optimizing recommender system ranking policies and adaptive streaming controllers. Decision-makers generally wish to optimize for long-term treatment effects of the system changes, which often requires running experiments for a long time, as short-term measurements can be misleading due to non-stationarity in treatment effects over time. Sequential experimentation strategies, which typically involve several iterations, can be prohibitively long in such cases. We describe a novel approach that combines fast experiments (e.g., biased experiments run only for a few hours or days) and/or offline proxies (e.g., off-policy evaluation) with long-running, slow experiments to perform sequential, Bayesian optimization over large action spaces in a short amount of time.
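One minimal instance of the fast/slow combination is to learn the systematic bias of the fast measurements on arms that were also measured with a slow, unbiased experiment, then use the debiased fast signal everywhere else. Everything below (arm count, noise levels, an additive-bias model) is an illustrative assumption, not the talk's actual estimator:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical long-term treatment effects for five candidate arms.
true_effect = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
bias = 0.8  # fast (short-run) experiments are systematically biased

# Fast, cheap measurements on every arm; slow, unbiased ones on two arms.
fast = true_effect + bias + rng.normal(0, 0.05, 5)
slow_idx = [0, 4]
slow = true_effect[slow_idx] + rng.normal(0, 0.05, 2)

# Learn the bias from arms measured both ways, then debias the rest.
bias_hat = np.mean(fast[slow_idx] - slow)
debiased = fast - bias_hat
best_arm = int(np.argmax(debiased))
```

A full system would wrap this in a Bayesian model (jointly inferring effects and bias with posterior uncertainty) and use it inside the sequential optimization loop, but the precision-vs-bias tradeoff is the same.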
Bayesian optimization is a class of global optimization techniques. In Bayesian optimization, the underlying objective function is modeled as a realization of a Gaussian process. Although the Gaussian process assumption implies a random distribution of the Bayesian optimization outputs, quantification of this uncertainty is rarely studied in the literature. In this work, we propose a novel approach to assess the output uncertainty of Bayesian optimization algorithms, which proceeds by constructing confidence regions of the maximum point (or value) of the objective function. These regions can be computed efficiently, and their confidence levels are guaranteed by the uniform error bounds for sequential Gaussian process regression newly developed in the present work. Our theory provides a unified uncertainty quantification framework for all existing sequential sampling policies and stopping criteria.
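One way to picture a confidence region for the maximum point: with a uniform bound on the GP regression error, any candidate whose upper confidence bound falls below the best lower confidence bound can be ruled out as the maximizer, and the surviving set is the region. The sketch below uses a fixed multiplier `beta` as a stand-in for the paper's uniform error bounds, and a toy objective and design of my own choosing:

```python
import numpy as np

def rbf(X1, X2, ls=0.2):
    # Squared-exponential kernel on 1-D inputs (unit prior variance).
    d = X1[:, None] - X2[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

def gp_posterior(X, y, Xs, noise=1e-6):
    # Exact GP regression: posterior mean and std. dev. at Xs.
    K = rbf(X, X) + noise * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    Ks = rbf(X, Xs)
    mu = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    sd = np.sqrt(np.clip(1.0 - np.sum(v ** 2, axis=0), 1e-12, None))
    return mu, sd

f = lambda x: -(x - 0.5) ** 2      # toy objective, maximizer at 0.5
X = np.linspace(0, 1, 8)           # design points (fixed for illustration)
grid = np.linspace(0, 1, 101)
mu, sd = gp_posterior(X, f(X), grid)

beta = 2.0                         # stand-in for a uniform-bound width
ucb, lcb = mu + beta * sd, mu - beta * sd
region = grid[ucb >= lcb.max()]    # candidates not ruled out as maximizer
```

The true maximizer at 0.5 survives the elimination while points far from it are excluded; the paper's contribution is, in effect, choosing the bound width so the stated confidence level holds uniformly across sequential sampling policies and stopping rules.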
Speaker
Rui Tuo, Texas A&M University