Frontiers in Models Based on Gaussian Process Priors: Asymptotics, Applications, and Scalability

Terrance Savitsky, Chair
US Bureau of Labor Statistics
 
Sanvesh Srivastava, Organizer
University of Iowa
 
Tuesday, Aug 6: 2:00 PM - 3:50 PM
Session 1089: Invited Paper Session
Oregon Convention Center, Room: CC-C123

Applied: Yes

Main Sponsor

Section on Bayesian Statistical Science

Co-Sponsors

International Indian Statistical Association
International Society for Bayesian Analysis (ISBA)

Presentations

Bayesian Fixed-Domain Asymptotics for Covariance Parameters in Spatial Gaussian Process Models

Gaussian process models typically contain finite-dimensional parameters in the covariance function that need to be estimated from the data. We establish new Bayesian fixed-domain asymptotic theory for the covariance parameters in spatial Gaussian process regression models with an isotropic Matérn covariance function, which has many applications in spatial statistics. For the model without a nugget, we show that when the domain dimension is less than or equal to three, the microergodic parameter and the range parameter are asymptotically independent in the posterior. While the posterior distribution of the microergodic parameter is asymptotically normal with a shrinking variance, the posterior distribution of the range parameter does not in general converge to any point mass. For the model with a nugget, we derive a new evidence lower bound and consistent higher-order quadratic variation estimators, which lead to explicit posterior contraction rates for both the microergodic parameter and the nugget parameter. We further study the asymptotic efficiency of Bayesian kriging prediction. All the new theoretical results are verified in numerical experiments and a real data analysis.
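
As background, under one common parameterization (the parameterization used in the talk may differ), the isotropic Matérn covariance with variance \sigma^2, range \rho, and smoothness \nu is

    K(h) = \sigma^2 \, \frac{2^{1-\nu}}{\Gamma(\nu)} \left( \frac{h}{\rho} \right)^{\nu} K_\nu\!\left( \frac{h}{\rho} \right), \qquad h \ge 0,

where K_\nu denotes the modified Bessel function of the second kind. Under fixed-domain asymptotics with domain dimension d \le 3, neither \sigma^2 nor \rho is consistently estimable on its own (Zhang, 2004), but the microergodic parameter \sigma^2 / \rho^{2\nu} is; this is the quantity whose posterior is asymptotically normal in the abstract above.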

Speaker

Cheng Li

Bayesian Non-linear Latent Variable Modeling via Random Fourier Features

The Gaussian process latent variable model (GPLVM) is a popular probabilistic method used for nonlinear dimension reduction, matrix factorization, and state-space modeling. Inference for GPLVMs is computationally tractable only when the data likelihood is Gaussian. Here, we present a method to perform Markov chain Monte Carlo (MCMC) inference for generalized Bayesian nonlinear latent variable modeling. The crucial insight necessary to generalize GPLVMs to arbitrary observation models is that we approximate the kernel function in the Gaussian process mappings with random Fourier features; this allows us to compute the gradient of the posterior in closed form with respect to the latent variables. We show that we can generalize GPLVMs to non-Gaussian observations, such as Poisson, negative binomial, and multinomial distributions, using our random feature latent variable model (RFLVM). Our generalized RFLVMs perform on par with state-of-the-art latent variable models on a wide range of applications. 
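
As a minimal illustration of the random Fourier feature idea this abstract relies on, the R sketch below approximates a squared exponential (RBF) kernel with random cosine features (a generic Rahimi-Recht construction, not the authors' RFLVM code; all function and variable names are illustrative):

    # Random features approximating the RBF kernel k(x, z) = exp(-||x - z||^2 / (2 * ell^2))
    rff_features <- function(X, n_features = 200, ell = 1) {
      d <- ncol(X)
      # Frequencies drawn from the kernel's spectral density, N(0, I / ell^2)
      W <- matrix(rnorm(d * n_features, sd = 1 / ell), d, n_features)
      b <- runif(n_features, 0, 2 * pi)        # random phase offsets
      Z <- sweep(X %*% W, 2, b, "+")           # n x n_features projections
      sqrt(2 / n_features) * cos(Z)
    }

    set.seed(1)
    X <- matrix(rnorm(20), 10, 2)              # 10 points in 2 dimensions
    Phi <- rff_features(X, n_features = 5000)
    K_approx <- Phi %*% t(Phi)                 # Monte Carlo kernel estimate
    K_exact <- exp(-as.matrix(dist(X))^2 / 2)  # exact RBF kernel with ell = 1
    max(abs(K_approx - K_exact))               # small; shrinks as n_features grows

Because the features are finite-dimensional, the GP mapping becomes a linear model in the feature matrix Phi, which is what makes closed-form posterior gradients with respect to the latent variables available under non-Gaussian likelihoods.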

Speaker

Michael Zhang

Deep Gaussian Process Surrogates - Big and Small

Deep Gaussian processes (DGPs) upgrade ordinary GPs through functional composition, in which intermediate GP layers warp the original inputs, providing flexibility to model non-stationary dynamics. Recent applications in machine learning favor approximate, optimization-based inference for fast predictions, but applications to computer surrogate modeling demand broader uncertainty quantification (UQ). We prioritize UQ through full posterior integration in a Bayesian scheme, hinging on elliptical slice sampling of the latent layers. We demonstrate how our DGP's non-stationary flexibility, combined with appropriate UQ, allows for active learning: a virtuous cycle of data acquisition and model updating that departs from traditional space-filling design and yields more accurate surrogates for fixed simulation effort. But not all simulation campaigns can be developed sequentially, and many existing computer experiments are simply too big for full DGP posterior integration because of cubic scaling bottlenecks. For this case, we introduce the Vecchia approximation. We vet DGPs on simulated and real-world examples, and we showcase implementation in the deepgp package for R on CRAN.
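
A minimal usage sketch of the deepgp package follows (function and argument names as documented on CRAN, though the toy example and settings are illustrative, and details may differ across package versions):

    library(deepgp)

    # Toy non-stationary response: a smooth wave with a jump at x = 0.5
    x <- matrix(seq(0, 1, length = 40), ncol = 1)
    y <- c(sin(20 * x) + 2 * (x > 0.5)) + rnorm(40, sd = 0.05)

    # Fit a two-layer DGP by MCMC; the latent layer is updated
    # with elliptical slice sampling
    fit <- fit_two_layer(x, y, nmcmc = 2500)
    fit <- trim(fit, burn = 500, thin = 2)   # discard burn-in, thin the chains

    # Posterior predictive mean and variance on a test grid
    xx <- matrix(seq(0, 1, length = 200), ncol = 1)
    fit <- predict(fit, xx)
    plot(fit)

    # For designs too large for full posterior integration, a
    # Vecchia-approximated fit sidesteps the cubic scaling bottleneck:
    # fit_big <- fit_two_layer(x, y, nmcmc = 2500, vecchia = TRUE)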

Speaker

Annie Booth, NC State University

Inferring Manifolds from Noisy Data Using Gaussian Processes

We focus on the study of a noisy data set sampled around an unknown Riemannian submanifold of a high-dimensional space. Most existing manifold learning algorithms replace the original data with lower-dimensional coordinates without providing an estimate of the manifold in the observation space or using the manifold to denoise the original data. We propose the Manifold reconstruction via Gaussian processes (MrGap) algorithm to address these problems, allowing interpolation of the estimated manifold between fitted data points. The proposed approach is motivated by novel theoretical properties of local covariance matrices constructed from noisy samples on a manifold. Our results enable us to turn a global manifold reconstruction problem into a local regression problem, allowing the application of Gaussian processes for probabilistic manifold reconstruction. In this talk, I will review classical manifold learning algorithms and discuss the theoretical foundation of the new method, MrGap. Simulated and real data examples will be provided to illustrate its performance. This talk is based on joint work with David Dunson.
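
To convey the local covariance idea behind this approach, the R sketch below estimates a tangent space by eigendecomposing a local covariance matrix (a generic local PCA computation, not the MrGap implementation; all names are illustrative):

    # Estimate the d-dimensional tangent space at x0 from noisy samples X (n x D)
    local_tangent <- function(X, x0, r, d) {
      nbr <- X[rowSums(sweep(X, 2, x0)^2) <= r^2, , drop = FALSE]  # neighbors within radius r
      C <- cov(nbr)                              # local covariance matrix
      eigen(C)$vectors[, 1:d, drop = FALSE]      # top-d eigenvectors span the tangent space
    }

    # Noisy circle (a d = 1 manifold) embedded in D = 2 dimensions
    set.seed(1)
    theta <- runif(500, 0, 2 * pi)
    X <- cbind(cos(theta), sin(theta)) + matrix(rnorm(1000, sd = 0.02), ncol = 2)
    local_tangent(X, x0 = c(1, 0), r = 0.2, d = 1)  # roughly (0, 1) or (0, -1): the tangent at (1, 0)

Fitting a Gaussian process regression within each such local chart is one way the global reconstruction problem reduces to local regression, as described above.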

Speaker

Nan Wu, The University of Texas at Dallas