Efficient Learning and Uncertainty Quantification in High-Dimensional and Computational Systems

Chair

Paromita Banerjee, John Carroll University
 
Monday, Aug 4: 10:30 AM - 12:20 PM
4052 
Contributed Papers 
Music City Center 
Room: CC-205C 

Main Sponsor

Uncertainty Quantification in Complex Systems Interest Group

Presentations

A Bayesian Optimization Framework for Personalized System Design Based on Computer Experiments

Personalized system design incorporates human characteristics into optimization to improve system responses. Current methods rely heavily on observational data and discrete designs, ignoring variation across both the population and the design space, and traditional Bayesian Optimization (BO) struggles to build reliable surrogate models, requiring extensive space-filling simulations and often failing to predict individual responses. We propose a new BO framework that learns a continuous design policy, assigning each individual an optimal design based on their covariates so as to minimize the population-wise expected response. The proposed method addresses these issues by using distinct objective functions across three sub-steps: training a Gaussian process surrogate, optimizing the design policy, and efficiently introducing new samples through a new acquisition function, Personalized Information Gain (PIG), which targets informative simulation runs along the optimal design policy to reduce uncertainty and improve computational efficiency. Numerical examples on synthetic data and on vehicle restraint system design demonstrate the effectiveness and robustness of the proposed method. 
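A minimal sketch of the three sub-steps, under stated stand-ins: the simulator below is a toy quadratic, the surrogate is a simple kernel smoother rather than the paper's Gaussian process, and the acquisition rule is a crude novelty search along the policy, not the actual PIG criterion.

```python
import math, random

random.seed(0)

def simulate(d, h):
    # toy simulator: for covariate h the best design is 0.6 * h
    return (d - 0.6 * h) ** 2

# initial space-filling runs: (design, covariate, response)
data = [(d, h, simulate(d, h))
        for d, h in [(random.random(), random.random()) for _ in range(5)]]

def surrogate(d, h):
    # sub-step 1: kernel-smoothed prediction of the response surface
    w = [math.exp(-20 * ((d - dd) ** 2 + (h - hh) ** 2)) for dd, hh, _ in data]
    return sum(wi * y for wi, (_, _, y) in zip(w, data)) / (sum(w) + 1e-12)

GRID = [i / 20 for i in range(21)]

def policy(h):
    # sub-step 2: per-covariate design minimizing the surrogate response
    return min(GRID, key=lambda d: surrogate(d, h))

def acquire():
    # sub-step 3: simulate where the current policy is least supported by data
    def novelty(h):
        d = policy(h)
        return min((d - dd) ** 2 + (h - hh) ** 2 for dd, hh, _ in data)
    h = max(GRID, key=novelty)
    data.append((policy(h), h, simulate(policy(h), h)))

for _ in range(20):
    acquire()
```

Searching along the current policy, rather than over the whole design-covariate space, is the key efficiency idea: simulation effort concentrates where it changes the recommended designs.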

Keywords

Bayesian Optimization

Personalized System Design

Vehicle Restraint System Design 

Co-Author(s)

Wenbo Sun
Jingwen Hu, University of Michigan Transportation Research Institute
Judy Jin, University of Michigan

First Author

Jiacheng Liu, University of Michigan

Presenting Author

Jiacheng Liu, University of Michigan

WITHDRAWN: A Likelihood-Based Approach to Distribution Regression Using Conditional Deep Generative Models

In this work, we explore the theoretical properties of conditional deep generative models under the statistical framework of distribution regression, where the response variable lies in a high-dimensional ambient space but concentrates around a potentially lower-dimensional manifold. More specifically, we study the large-sample properties of a likelihood-based approach for estimating these models. Our results yield the convergence rate of a sieve maximum likelihood estimator (MLE) for the conditional distribution (and its devolved counterpart) of the response given predictors in the Hellinger (respectively, Wasserstein) metric. The rates depend solely on the intrinsic dimension and smoothness of the true conditional distribution. These findings explain, from the standpoint of statistical foundations, why conditional deep generative models can circumvent the curse of dimensionality, and demonstrate that they can learn a broader class of nearly singular conditional distributions. Our analysis also emphasizes the importance of introducing a small noise perturbation to the data when they are supported sufficiently close to a manifold. 
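For context only (this is the canonical shape of intrinsic-dimension rates in nonparametric theory, not a statement of this paper's exact result): when the target is beta-smooth and concentrates near a d*-dimensional set inside an ambient space of much larger dimension, convergence rates typically take the form

```latex
\epsilon_n \;\asymp\; n^{-\beta/(2\beta + d^{*})} (\log n)^{c},
```

with the ambient dimension entering only through logarithmic factors. This is the sense in which a rate can "depend solely on the intrinsic dimension and smoothness."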

Keywords

Deep Generative Model

Conditional Distribution

Smoothness Disparity

Sieve MLE

Manifold

Curse of dimensionality 

Co-Author(s)

Yun Yang, University of Illinois Urbana-Champaign
Lizhen Lin

First Author

Shivam Kumar

Uncertainty of Network Topology with Applications to Out-of-Distribution Detection

Persistent homology is a crucial concept in computational topology, providing a multiscale topological description of a space. It is particularly significant in topological data analysis, which aims to draw statistical inferences from a topological perspective. In this work, we introduce a new topological summary for Bayesian neural networks, termed the predictive topological uncertainty (pTU). The proposed pTU measures the uncertainty in the interaction between the model and the inputs, offering insights from the model's perspective: if two samples interact with a model in a similar way, they are considered identically distributed. We also show that the pTU is insensitive to the model architecture. As an application, we use the pTU to address out-of-distribution (OOD) detection, which is critical for ensuring model reliability, since failure to detect OOD inputs can lead to incorrect and unreliable predictions. We propose a pTU-based significance test for OOD inputs, providing a statistical framework for the detection problem. The effectiveness of the framework is validated through various experiments in terms of statistical power, sensitivity, and robustness. 
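A sketch of how a summary-based significance test for OOD detection can work, assuming a calibration set of in-distribution inputs. The `ptu` function here is a scalar placeholder, not the actual persistent-homology computation; the empirical p-value machinery around it is the generic conformal-style recipe.

```python
import random

random.seed(1)

def ptu(x):
    # placeholder topological summary: larger for more unusual inputs
    # (stands in for the paper's predictive topological uncertainty)
    return abs(x)

# calibration scores from held-out in-distribution inputs
calib = [ptu(random.gauss(0, 1)) for _ in range(500)]

def ood_pvalue(x):
    # one-sided empirical p-value: how often calibration inputs look
    # at least as unusual as x; small p-value => declare x OOD
    s = ptu(x)
    return (1 + sum(c >= s for c in calib)) / (1 + len(calib))
```

An input far outside the training distribution (e.g. `ood_pvalue(8.0)`) receives a small p-value, while a typical input (e.g. `ood_pvalue(0.0)`) does not, so rejecting at a fixed level controls the in-distribution false-alarm rate.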

Keywords

Persistent Homology

Bayesian Neural Network

Out-of-Distribution

Uncertainty

Topological Data Analysis 

Co-Author

Sing-Yuan Yeh, National Taiwan University

First Author

Chun-Hao Yang, National Taiwan University

Presenting Author

Chun-Hao Yang, National Taiwan University

Fast Data Inversion for High-Dimensional Dynamical Systems from Noisy Measurements

In this work, we develop a scalable approach for a flexible latent factor model for high-dimensional dynamical systems. Each latent factor process has its own correlation and variance parameters, and the orthogonal factor loading matrix can be either fixed or estimated. Using an orthogonal factor loading matrix avoids inverting the posterior covariance matrix at each time step of the Kalman filter, and we derive closed-form expressions in an expectation-maximization algorithm for parameter estimation, which substantially reduces the computational complexity without approximation. Our study is motivated by inversely estimating slow slip events from geodetic data, such as continuous GPS measurements. Extensive simulation studies illustrate the higher accuracy and scalability of our approach compared to alternatives. Applying the method to geodetic measurements in the Cascadia region, our estimated slip agrees better with independently measured seismic data of tremor events. The substantial acceleration from our method enables the use of massive noisy data for geological hazard quantification and other applications. 
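A toy sketch of the computational idea (the AR(1) dynamics, dimensions, and parameter values below are made up for illustration, not the paper's model): with an orthonormal loading matrix U and independent latent factors, projecting the data by U decouples it into separate scalar series, so filtering needs only one-dimensional Kalman recursions and never inverts a full posterior covariance matrix.

```python
import random

random.seed(2)

def scalar_kalman(ys, phi, q, r):
    """1-D Kalman filter for x_t = phi*x_{t-1} + N(0,q), y_t = x_t + N(0,r)."""
    m, p, means = 0.0, 1.0, []
    for y in ys:
        m, p = phi * m, phi * phi * p + q       # predict
        k = p / (p + r)                         # gain: a scalar division,
        m, p = m + k * (y - m), (1.0 - k) * p   # not a matrix inversion
        means.append(m)
    return means

# Simulate two independent AR(1) factors observed through orthonormal U.
T, phis, sd_state, sd_obs = 200, (0.95, 0.8), 0.3, 0.5
f = [[0.0], [0.0]]
for t in range(1, T):
    for j in range(2):
        f[j].append(phis[j] * f[j][-1] + random.gauss(0, sd_state))
U = [(0.6, -0.8), (0.8, 0.6)]  # rows of U; its columns are orthonormal
y = [[U[i][0] * f[0][t] + U[i][1] * f[1][t] + random.gauss(0, sd_obs)
      for i in range(2)] for t in range(T)]

# Project once (U^T y), then filter each factor independently.
proj = [[sum(U[i][j] * y[t][i] for i in range(2)) for t in range(T)]
        for j in range(2)]
est = [scalar_kalman(proj[j], phis[j], sd_state ** 2, sd_obs ** 2)
       for j in range(2)]
```

In the real setting the data dimension is large and the factor count small, so replacing one large matrix inversion per time step with a handful of scalar recursions is where the substantial acceleration comes from.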

Keywords

Bayesian prior

latent factor models

Gaussian processes

expectation-maximization algorithm

Kalman filter 

Co-Author(s)

Xubo Liu, University of California, Santa Barbara
Paul Segall, Stanford University
Mengyang Gu, University of California, Santa Barbara

First Author

Yizi Lin

Presenting Author

Mengyang Gu, University of California, Santa Barbara

Multi-Task Active Learning with Efficient Resource Allocation

A sufficient amount of high-quality labeled data is essential for training supervised machine learning models; however, labeling all data can be costly, particularly in domains like healthcare and image classification. While active learning (AL) has emerged as a promising approach to improving labeling efficiency, existing frameworks overlook scenarios in which supplementary tasks with varying resource costs, such as out-of-network diagnostic tests, can assist the labeling process. In this work, we introduce Active Learning with Cost-Adaptive Task Resource Allocations (ALCATRAs), a novel framework designed to optimize task resource allocation and improve model predictive performance under limited budgets. ALCATRAs consists of two main components: a task-selection policy that strategically selects a sequence of cost-effective tasks to perform on unlabeled data, and a surrogate learning procedure that transfers knowledge from completed tasks to enhance model predictions. Extensive experiments and applications to UC electronic health records (EHR) and the FashionMNIST benchmark dataset demonstrate the superior sample efficiency of the proposed ALCATRAs framework. 
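A hypothetical sketch of cost-aware task selection (task names, benefits, costs, and the greedy rule are all illustrative stand-ins, not the actual ALCATRAs policy): rank (point, task) pairs by uncertainty times benefit per unit cost, then buy tasks until the budget runs out, at most one task per point.

```python
# (benefit, cost) per supplementary task -- made-up numbers for illustration
TASKS = {"cheap_test": (0.3, 1.0), "full_label": (1.0, 5.0)}

def select(uncertainties, budget):
    """Greedily allocate a labeling budget across unlabeled points."""
    plan, spent = [], 0.0
    cands = sorted(
        ((u * b / c, i, name, c)
         for i, u in enumerate(uncertainties)
         for name, (b, c) in TASKS.items()),
        reverse=True)
    for score, i, name, c in cands:
        # take the next-best affordable task on a point not yet served
        if spent + c <= budget and all(j != i for _, j in plan):
            plan.append((name, i))
            spent += c
    return plan, spent

# three unlabeled points with model uncertainties 0.9, 0.1, 0.5
plan, spent = select([0.9, 0.1, 0.5], budget=2.0)
```

With a budget of 2.0, the greedy rule buys the cheap test for the two most uncertain points rather than one expensive full label, which is the kind of trade-off a cost-adaptive policy must make.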

Keywords

Active learning

Sequential decision-making

Surrogate learning

Sample efficiency

Feature acquisition

Resource allocation 

Co-Author

Annie Qu, University of California, Irvine

First Author

Hanwen Ye

Presenting Author

Hanwen Ye