Advances in Statistical Modeling and Inference for Complex Physical Systems

Jonathan Stallrich Chair
North Carolina State University
 
Tuesday, Aug 5: 8:30 AM - 10:20 AM
4084 
Contributed Papers 
Music City Center 
Room: CC-210 
This session brings together research on statistical modeling tools for complex systems and new insights on inference and uncertainty quantification for problems in the physical and engineering sciences.

Main Sponsor

Section on Physical and Engineering Sciences

Presentations

Efficient Gaussian Process Modeling for Replicated and Censored Data in Manufacturing Applications

Gaussian process (GP) regression is widely used for modeling responses in both physical and computer experiments. In practice, data from physical experiments may include replications and be subject to censoring due to equipment limitations or contextual constraints. Motivated by real-world manufacturing data, we develop a GP modeling framework that efficiently analyzes replicated and potentially censored data. Our approach leverages recent advances in likelihood-based inference and latent-variable methods for GPs, effectively exploiting replication while rigorously incorporating censoring information, analogous to exact simulation techniques for truncated distributions. We demonstrate the effectiveness of our method on synthetic manufacturing data, showing that it provides enhanced prediction, uncertainty quantification, and computational scalability, making it well-suited for large-scale applications with structured replication and incomplete data.
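
How censoring enters a Gaussian likelihood can be illustrated with a minimal sketch. It assumes independent Gaussian noise around a fixed mean at a single input setting and a known upper detection limit; it is not the authors' GP framework, and the names `censored_loglik` and `upper_limit` are hypothetical. Uncensored replicates contribute a density term, while readings at the limit contribute the probability of exceeding it.

```python
import math


def norm_logpdf(y, mu, sigma):
    """Log density of N(mu, sigma^2) at y."""
    z = (y - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def norm_logsf(y, mu, sigma):
    """Log of P(Y > y) for Y ~ N(mu, sigma^2), computed with erfc."""
    z = (y - mu) / sigma
    tail = 0.5 * math.erfc(z / math.sqrt(2.0))
    # Guard against underflow to zero far in the upper tail.
    return math.log(max(tail, 1e-300))


def censored_loglik(obs, mu, sigma, upper_limit):
    """Log-likelihood of replicate readings at one input setting.

    Readings at or above `upper_limit` are treated as right-censored
    (assumption: the instrument reports the limit itself in that case).
    """
    total = 0.0
    for y in obs:
        if y >= upper_limit:
            total += norm_logsf(upper_limit, mu, sigma)
        else:
            total += norm_logpdf(y, mu, sigma)
    return total


# Three replicates, one of which hit the (hypothetical) limit of 10.0.
ll = censored_loglik([8.7, 9.4, 10.0], mu=9.5, sigma=0.8, upper_limit=10.0)
```

In a full GP model the independent mean would be replaced by correlated latent function values, which is where truncated multivariate normal computations arise.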

Keywords

Censored data

Gaussian process

Replication

Truncated multivariate normal 

Co-Author(s)

Emily Kang, University of Cincinnati
Shuo Li, Procter & Gamble

First Author

Ying Zhang

Presenting Author

Ying Zhang

Two-Step Bayesian Estimation of Sparse Dynamical Systems Using Data-Driven Closure Models

Many problems in science and engineering involve observing sample trajectories from processes with partially known dynamics. Often the systems are driven by an ODE with nonlinear dependence between elements of the state vector, and building mathematical models of them requires expert knowledge. Recently, a growing literature has explored solving such systems using machine learning, with results suggesting data-driven modeling can facilitate faster discovery. However, sparsity, missing data, and observation error limit existing methods.

We propose to address these challenges in two stages. First, we address sparsity and observation errors using data assimilation (DA) to in-fill trajectories of observed data. Second, we estimate the dynamics using an MCMC-based model that combines spline estimation with a sparse prior on the dynamics of the ODE. We also strengthen the sparse prior of the MCMC model with additional knowledge of the closure terms to improve model estimation. Treating the DA trajectories as additional priors, we average over DA ensembles to improve forecasting.
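
The idea of recovering sparse dynamics from a trajectory can be sketched on a toy problem. The example below covers only the second stage and swaps the abstract's MCMC with a sparse prior for a much simpler stand-in: one round of thresholded least squares on a small candidate library. The trajectory is noise-free and fully observed, and the library `[1, x, x^2]`, the threshold, and all function names are assumptions made for illustration.

```python
import math


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def least_squares(columns, y):
    """Least squares via the normal equations; `columns` holds feature columns."""
    k = len(columns)
    gram = [[sum(a * b for a, b in zip(columns[i], columns[j])) for j in range(k)]
            for i in range(k)]
    rhs = [sum(a * b for a, b in zip(columns[i], y)) for i in range(k)]
    return solve(gram, rhs)


def sparse_dynamics(x, dt, threshold=0.1):
    """Fit dx/dt = c0 + c1*x + c2*x^2, zero small terms, refit the rest."""
    # Central-difference derivative at interior points.
    dxdt = [(x[i + 1] - x[i - 1]) / (2.0 * dt) for i in range(1, len(x) - 1)]
    xs = x[1:-1]
    library = [[1.0] * len(xs), xs, [v * v for v in xs]]
    coef = least_squares(library, dxdt)
    keep = [i for i, c in enumerate(coef) if abs(c) >= threshold]
    refit = least_squares([library[i] for i in keep], dxdt) if keep else []
    out = [0.0] * len(library)
    for i, c in zip(keep, refit):
        out[i] = c
    return out


# Trajectory of dx/dt = -2x, x(0) = 1, sampled on a regular grid.
dt = 0.01
traj = [math.exp(-2.0 * i * dt) for i in range(201)]
coef = sparse_dynamics(traj, dt)  # expect only the linear term to survive
```

With noisy or gappy data the finite-difference step breaks down quickly, which is the motivation for in-filling trajectories first and for placing a prior on the coefficients rather than hard-thresholding them.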

Keywords

Spline estimation

Gaussian processes

Koopman operators

dynamical systems

missing data

uncertainty quantification 

Co-Author(s)

Toryn Schafer, Texas A&M University
Kyle Neal, Sandia National Laboratories
Moe Khalil, Sandia National Laboratories
Teresa Portone, Sandia National Laboratories
Rileigh Bandy, Sandia National Laboratories

First Author

Daniel Drennan, Department of Statistics, Texas A&M University

Presenting Author

Daniel Drennan, Department of Statistics, Texas A&M University

Ionospheric Observations from the ISS: Overcoming Noise Challenges in Signal Extraction

The Electric Propulsion Electrostatic Analyzer Experiment (ÈPÈE) is a compact ion energy bandpass filter that was deployed on the International Space Station (ISS) in March 2023 and provided continuous measurements through April 2024. This period coincides with the Solar Cycle 25 maximum, capturing unique observations of solar activity extremes in the mid- to low-latitude regions of the topside ionosphere. Plasma parameters derived from in-situ measurements enhance understanding of local space weather and its impact on satellite navigation, communication, and GPS accuracy. We present a statistical pipeline for processing ÈPÈE data, addressing challenges such as instrument noise floor, temporal data density, and signal extraction. Unlike traditional methods that discard data due to noise, our approach learns a baseline noise level and fits the surface using a scaled Vecchia Gaussian process, enabling recovery of previously discarded values and creating possibilities for noise-assisted ionospheric monitoring.
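
For readers unfamiliar with the Vecchia approximation, its general form is sketched below. The notation is an assumption introduced here, not taken from the abstract: $y_1, \dots, y_n$ are the ordered observations and $c(i)$ is a small conditioning set of at most $m$ earlier indices, typically nearest neighbors.

```latex
% Exact factorization of the joint density, then the Vecchia approximation:
p(y_1, \dots, y_n)
  = \prod_{i=1}^{n} p\bigl(y_i \mid y_1, \dots, y_{i-1}\bigr)
  \approx \prod_{i=1}^{n} p\bigl(y_i \mid y_{c(i)}\bigr),
\qquad c(i) \subseteq \{1, \dots, i-1\}, \quad |c(i)| \le m .
```

Each factor involves a Gaussian conditional of dimension at most $m$, so the cost grows roughly linearly in $n$ for fixed $m$ rather than cubically. In the "scaled" variant, as the term is commonly used, inputs are rescaled by their estimated length scales before ordering and neighbor selection; whether the pipeline described here follows exactly that recipe is not stated in the abstract.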

Keywords

signal processing

Gaussian Processes

background estimation

time series

astrophysics

Ionospheric science 

Co-Author(s)

Kelly Moran, Los Alamos National Laboratory
Carlos Maldonado, Los Alamos National Laboratory

First Author

Rachel Ulrich

Presenting Author

Rachel Ulrich

Monotonic warpings for additive and deep Gaussian processes

Gaussian processes (GPs) are canonical as surrogates for computer experiments because they enjoy a degree of analytic tractability. But that breaks when the response surface is constrained, say to be monotonic. Here, we provide a "mono-GP" construction for a single input that is highly efficient even though the calculations are non-analytic. Key ingredients include transformation of a reference process and elliptical slice sampling. We then show how mono-GP may be deployed effectively in two ways. One is additive, extending monotonicity to more inputs; the other is as a prior on injective latent warping variables in a deep Gaussian process for (non-monotonic, multi-input) non-stationary surrogate modeling. We provide illustrative and benchmarking examples throughout, showing that our methods yield improved performance over the state-of-the-art on examples from those two classes of problems. 
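
Elliptical slice sampling, one of the ingredients named above, can be sketched in a few lines. This is a generic version of the algorithm on a toy one-dimensional problem, not the mono-GP sampler itself: it assumes a standard normal prior with identity covariance (for a GP prior the auxiliary draw would come from the GP covariance instead), and the function name and toy likelihood are hypothetical.

```python
import math
import random


def elliptical_slice_step(f, log_lik, rng):
    """One elliptical slice sampling update for a prior f ~ N(0, I)."""
    nu = [rng.gauss(0.0, 1.0) for _ in f]  # auxiliary draw from the prior
    # Slice height: log-likelihood of the current state plus log of U in (0, 1].
    log_y = log_lik(f) + math.log(1.0 - rng.random())
    theta = rng.uniform(0.0, 2.0 * math.pi)
    lo, hi = theta - 2.0 * math.pi, theta
    while True:
        # Proposal on the ellipse through the current state and the auxiliary draw.
        prop = [a * math.cos(theta) + b * math.sin(theta) for a, b in zip(f, nu)]
        if log_lik(prop) > log_y:
            return prop
        # Rejected: shrink the angle bracket toward theta = 0 (the current state).
        if theta < 0.0:
            lo = theta
        else:
            hi = theta
        theta = rng.uniform(lo, hi)


# Toy target: prior f ~ N(0, 1), one observation y = 1 with unit noise variance,
# so the exact posterior is N(0.5, 0.5).
def log_lik(f):
    return -0.5 * (1.0 - f[0]) ** 2


rng = random.Random(0)
f = [0.0]
draws = []
for _ in range(20000):
    f = elliptical_slice_step(f, log_lik, rng)
    draws.append(f[0])

post_mean = sum(draws) / len(draws)
post_var = sum((d - post_mean) ** 2 for d in draws) / len(draws)
```

The sampler has no step size to tune and every update ends in an accepted move, which makes it attractive when the likelihood is non-Gaussian, as it is once a reference process is transformed to enforce monotonicity.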

Keywords

computer experiment

surrogate modeling

constrained response surface

elliptical slice sampling

uncertainty quantification

Bayesian inference 

First Author

Steven Barnett, Virginia Tech

Presenting Author

Steven Barnett, Virginia Tech

WITHDRAWN: A scalable Bayesian approach to spectral line detection and galaxy redshift estimation

Estimating galaxy redshifts is crucial for constraining key physical quantities like dark energy. Modern spectroscopic telescopes such as the James Webb Space Telescope (JWST) are producing massive amounts of high-resolution data that enable precise redshift estimation. However, this is only possible when spectral lines are present in the data, which is not known a priori. We adopt a fully Bayesian approach to estimate redshift, using Bayes factors to test for multiple spectral lines. The main challenge is computational, as the known physical constraints between redshift and spectral line signal intensities lead to a highly multimodal posterior distribution. To address this, we develop a fast Laplace approximation-based method that explicitly accounts for multimodality and apply it to new JWST spectra. 
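
As background, the standard Laplace approximation to a marginal likelihood and the resulting Bayes factor can be written as follows. The notation is an assumption introduced here: $\theta \in \mathbb{R}^d$ are the parameters of model $M$, $\hat\theta$ is a posterior mode, and $H(\hat\theta)$ is the negative Hessian of the log joint density at that mode. The sum over modes in the second line is one common way to handle well-separated modes; the abstract does not say whether the authors use this form.

```latex
\log p(y \mid M) \approx
  \log p(y \mid \hat\theta, M) + \log p(\hat\theta \mid M)
  + \frac{d}{2} \log(2\pi) - \frac{1}{2} \log \bigl| H(\hat\theta) \bigr| ,

% Multimodal posterior with modes \hat\theta_1, \dots, \hat\theta_K:
p(y \mid M) \approx \sum_{k=1}^{K}
  p(y \mid \hat\theta_k, M)\, p(\hat\theta_k \mid M)\,
  (2\pi)^{d/2}\, \bigl| H(\hat\theta_k) \bigr|^{-1/2} ,

% Bayes factor comparing a model with a spectral line (M_1) to one without (M_0):
\mathrm{BF}_{10} = \frac{p(y \mid M_1)}{p(y \mid M_0)} .
```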

Keywords

Bayes factors

Astrostatistics

Laplace approximation 

Co-Author(s)

Bonnabelle Zabelle, University of Minnesota
Sara Algeri, University of Minnesota
Galin Jones, University of Minnesota
Claudia Scarlata, University of Minnesota

First Author

Alexander Kuhn, University of Minnesota

Presenting Author

Alexander Kuhn, University of Minnesota

Neural Posterior Estimation for Inferring Weak Lensing Shear and Convergence from Pixels

Inferring the distortion of imaged galaxies due to weak gravitational lensing is a challenging inverse problem involving pixelization, instrument bias, and a low signal-to-noise ratio. Most traditional approaches to this task produce point estimates of weak lensing shear and convergence by measuring, averaging, and calibrating galaxy ellipticities, a multistage procedure that is subject to image noise, selection bias, and model misspecification. As an alternative, we propose a Bayesian method for weak lensing inference that jointly estimates shear and convergence maps from multiband images using a type of likelihood-free amortized variational inference called neural posterior estimation (NPE). NPE is computationally efficient due to its utilization of deep neural networks and implicit marginalization of nuisance latent variables, and it provides estimates of posterior uncertainty that can be propagated to downstream cosmological analyses. When evaluated on synthetic images from the LSST-DESC DC2 Simulated Sky Survey, the proposed algorithm produces posterior shear and convergence maps that are well-calibrated and consistent with the ground truth. 
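
The training objective usually associated with neural posterior estimation is shown below for orientation. The symbols are assumptions introduced here: $\theta$ denotes the latent quantities of interest (here, shear and convergence), $x$ the simulated images, and $q_\phi$ the neural network approximation to the posterior.

```latex
\hat\phi = \arg\min_{\phi} \;
  \mathbb{E}_{\theta \sim p(\theta),\; x \sim p(x \mid \theta)}
  \bigl[ -\log q_\phi(\theta \mid x) \bigr] .
```

The expectation is approximated with simulated pairs $(\theta, x)$, so the likelihood only needs to be sampled, never evaluated. Nuisance variables that are simulated but left out of $\theta$ are marginalized implicitly, and once trained the network returns a posterior for a new image in a single forward pass.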

Keywords

neural posterior estimation

weak gravitational lensing

likelihood-free inference

variational inference

cosmology

astronomical images 

Co-Author(s)

Shreyas Chandrashekaran, University of Michigan
Camille Avestruz, University of Michigan
Jeffrey Regier, University of Michigan

First Author

Timothy White, University of Michigan

Presenting Author

Timothy White, University of Michigan

Statistical Emulators for Inferring Planet Formation Conditions

Exploring the full parameter space of planet formation conditions and processes is computationally impractical, as each simulation can take weeks to complete and the parameter space is vast. In this work, we propose a framework to infer planet formation conditions while reducing computational costs. Our approach accounts for intrinsic variations in conditions and the stochastic nature of outcomes within a given set of conditions. We employ statistical emulators to model the relationship between planet formation parameters (such as solid normalization, radial distribution of solids, and gas disk depletion) and key observables, including period ratio, transit multiplicity, transit ratio, and Hill spacing. Since these observables are inherently stochastic and represented by probability distributions, we first map the stochastic outputs to a reduced-dimensional space. We then use Gaussian processes (GPs) to model the relationships within this reduced space. Once the emulators are trained on existing simulation data, we apply a Bayesian modular approach to infer the underlying parameters. Fast GP predictions within the likelihood ensure computationally feasible inference.

Keywords

Emulators

Gaussian Process

Stochastic Simulator

Astrostatistics

Planet formation 

First Author

Anirban Mondal, Case Western Reserve University

Presenting Author

Anirban Mondal, Case Western Reserve University