ENVR Student Paper Competition

Soutir Bandyopadhyay Chair
Colorado School of Mines
 
Emily Kang Organizer
University of Cincinnati
 
Sunday, Aug 3: 4:00 PM - 5:50 PM
0610 
Topic-Contributed Paper Session 
Music City Center 
Room: CC-106A 
Presentations by the winners of the ENVR student paper competition.

Keywords

Spatial statistics

Bayesian modeling

Neural networks

Spatial extremes 

Applied

No

Main Sponsor

Section on Statistics and the Environment

Presentations

A variational neural Bayes framework for inference on intractable posterior distributions

Classic Bayesian methods with complex models are frequently infeasible due to an intractable likelihood. Simulation-based inference methods, such as Approximate Bayesian Computing (ABC), calculate posteriors without accessing a likelihood function by leveraging the fact that data can be quickly simulated from the model, but converge slowly and/or poorly in high-dimensional settings. In this paper, we propose a framework for Bayesian posterior estimation by mapping data to posteriors of parameters using a machine learning model trained on data simulated from the complex model. Posterior distributions of model parameters are efficiently obtained by assuming a parametric form for the posterior, parametrized by the machine learning model, which is trained with the simulated observed data as inputs and the associated parameters as outputs. We show theoretically that our posteriors converge to the true posteriors in Kullback-Leibler divergence if the correct parametric family of the posterior is identified. We also provide tools to help us identify if our parametric assumption is close to the true posterior, and modeling options if that is not the case. Comprehensive simulation studies highlight our method's robustness and accuracy. 
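As a rough illustration of the idea described above (not the authors' implementation), the sketch below uses a toy model where the true posterior is known: theta ~ N(0, 1) and y_1, ..., y_n ~ N(theta, 1), so the posterior is N(n * ybar / (n + 1), 1 / (n + 1)). It assumes a Gaussian form for the posterior and, in place of a neural network, uses a linear map from the sample mean to the posterior mean with a constant variance. Both are fitted by minimizing the average negative Gaussian log-density of the simulated parameters, which here reduces to least squares. All function names are hypothetical.

```python
import numpy as np

def simulate(n_sims, n_obs, rng):
    """Draw parameters from the prior and data from the toy model."""
    theta = rng.normal(0.0, 1.0, size=n_sims)                  # prior draws
    y = rng.normal(theta[:, None], 1.0, size=(n_sims, n_obs))  # simulated data
    return theta, y

def fit_gaussian_posterior(theta, y):
    """Fit q(theta | y) = N(a + b * ybar, var) to simulated (y, theta) pairs.

    Minimizing the average negative log-density of theta under q gives
    least squares for (a, b) and the mean squared residual for var.
    """
    ybar = y.mean(axis=1)
    X = np.column_stack([np.ones_like(ybar), ybar])
    coef, *_ = np.linalg.lstsq(X, theta, rcond=None)
    resid = theta - X @ coef
    return coef, np.mean(resid ** 2)

def approximate_posterior(coef, var, y_obs):
    """Return the fitted posterior mean and variance for new data."""
    return coef[0] + coef[1] * np.mean(y_obs), var

rng = np.random.default_rng(0)
n_obs = 10
theta, y = simulate(50_000, n_obs, rng)
coef, var = fit_gaussian_posterior(theta, y)

y_obs = rng.normal(1.5, 1.0, size=n_obs)  # pretend these are the observed data
post_mean, post_var = approximate_posterior(coef, var, y_obs)
true_mean = n_obs * y_obs.mean() / (n_obs + 1)
true_var = 1.0 / (n_obs + 1)
print(post_mean, true_mean, post_var, true_var)
```

Because the Gaussian family is correct for this toy model, the fitted mean and variance should land close to the analytic posterior. In a realistic setting the linear map would be replaced by a flexible learner and the data by suitable summaries or raw inputs.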

Keywords

Simulation-based Inference

Emulator

Spatial Epidemiology

Spatial Extreme Models

Variational Inference

Approximate Bayesian Computing 

Co-Author(s)

Emily Hector, North Carolina State University
Amanda Lenzi
Brian Reich, North Carolina State University

Speaker

Elliot Maceda

Bayesian model-based decomposition reveals spatially varying temporal shifts in streamflow profiles across north temperate US Rivers

Anthropogenically forced climate shifts disrupt the seasonal behavior of climatic and hydrologic processes. The seasonality of streamflow has significant implications for the ecology of riverine ecosystems and for meeting societal demands for water resources. We develop a hierarchical Bayesian model of daily streamflow to quantify how the shape of seasonal hydrographs is changing and to evaluate temporal trends in model-based hydrologic indices related to shifts in flow timing and magnitude. We apply this model to 1,112 gages across the northern US over the years 1965-2022. We identify large-scale patterns in temporal changes to streamflow profiles that are consistent with regional changes in hydroclimate, including decreasing seasonal flow variability in the Pacific Northwest and increasing winter flows in the northeastern US. Within these regions we also observe fine-scale heterogeneity in streamflow timing and magnitude shifts, both of which have potentially significant implications for riverine ecosystem function and the ecosystem services these rivers provide.

Keywords

Streamflow

Climate

Bayesian 

Co-Author(s)

Tyler Wagner, USGS
Erin Schliep, North Carolina State University
Christopher Wikle, University of Missouri

Speaker

Kevin Collins

Deep Compositional Spatial Models for Nonstationary Extremal Dependence

Modeling the nonstationarity that often prevails in extremal dependence of spatial data can be challenging. Inference for stationary and isotropic models is considerably easier, but the assumptions that underpin these models are not typically met by data observed over large or topographically complex domains. A simple approach to accommodating spatial nonstationarity under the assumption of Gaussianity is to warp the original spatial domain to a latent space where stationarity and isotropy can be reasonably assumed. This approach has since seen further developments in both classical Gaussian-based geostatistics and spatial extremes. However, estimation of the warping function can be computationally expensive, and the transformation is not always guaranteed to be injective, which can lead to physically unrealistic transformations. We present a deep compositional model to capture nonstationarity in extremal dependence in exceedances of data functionals by leveraging efficient inference methods for r-Pareto processes. A detailed high-dimensional simulation study demonstrates the superior performance of our model in estimating the warped space, leading to an accurate characterization of the highly nonstationary extremal dependence structure. We apply the proposed approach to UK precipitation data, where we efficiently estimate the extremal dependence pattern with data observed at thousands of locations, which has never been achieved in previous relevant studies. The model is implemented in R with TensorFlow v2.

Keywords

Deformation

Nonstationarity

Deep Models

Spatial Extremes

r-Pareto Processes 

Co-Author(s)

Jordan Richards, University of Edinburgh
Raphael Huser, KAUST

Speaker

Xuanjie Shao, KAUST

GS-BART: Bayesian Additive Regression Trees with Graph-split Decision Rules for Generalized Spatial Nonparametric Regressions

Ensemble decision tree methods such as XGBoost, random forest, and Bayesian additive regression trees (BART) have gained enormous popularity in data science for their superior performance in machine learning regression and classification tasks. In this paper, we develop a new Bayesian graph-split-based additive decision trees method, called GS-BART, to improve the performance of BART for spatially dependent data. The new method adopts a highly flexible split rule complying with spatial structures to relax the axis-parallel split rule assumption adopted in most existing ensemble decision tree models. We consider a generalized spatial nonparametric regression model using GS-BART and design a scalable informed MCMC algorithm to sample the decision trees of GS-BART. The approach applies to both point-referenced and areal unit data as well as Gaussian and non-Gaussian responses. The algorithm leverages a gradient-based recursive algorithm on rooted directed spanning trees or chains (called arborescences). The superior performance of the method over conventional ensemble tree models and Gaussian process regression models is illustrated in various spatial data analyses.

Keywords

Bayesian Nonparametric Regression

Complex Domain

Decision Trees

Informed MCMC

Spatial Prediction

Spanning Tree 

Co-Author(s)

Huiyan Sang
Quan Zhou, Texas A&M University

Speaker

Shuren He, Texas A&M University

Modeling Spatial Extremes using Non-Gaussian Spatial Autoregressive Models via Convolutional Neural Networks

Data derived from remote sensing or numerical simulations often have a regular gridded structure and are large in volume. However, it is challenging to find accurate spatial models that can fill in missing grid cells or simulate the process effectively, especially when there is spatial heterogeneity and heavy-tailed marginal distributions. One effective method is to use a spatial autoregressive (SAR) model, which maps a location and its neighbors to spatially independent random variables. This model is flexible and well-suited for non-Gaussian fields. In this study, we assume that the innovations in the SAR model follow a Generalized Extreme Value (GEV) distribution, a heavy-tailed distribution, and incorporate nonlinear maps that combine a central grid location with its neighbors, introducing extreme spatial behavior. While these models are fast to simulate due to the sparseness of the construction, the estimation process is slow because the likelihood is intractable. To overcome this, we suggest training a convolutional neural network (CNN) on a large training set that covers a useful parameter space and then using the trained network for fast estimation. We apply this model to analyze yearly maximum precipitation data from a regional climate model to study spatial extremal behavior across North America. 
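To make the simulation side of this concrete, here is a minimal sketch of a linear SAR field on a regular grid with GEV innovations. It is an illustration under simplifying assumptions, not the authors' model: the map is linear, (I - rho * W) y = e, with W the row-normalized matrix of four nearest neighbors, whereas the abstract describes nonlinear maps. The names `gev_sample`, `neighbor_matrix`, and `simulate_sar` and all parameter values are hypothetical. GEV draws use the standard inverse-CDF formula for a nonzero shape parameter xi.

```python
import numpy as np

def gev_sample(rng, size, mu=0.0, sigma=1.0, xi=0.2):
    """Draw GEV(mu, sigma, xi) variates by inverse CDF (requires xi != 0)."""
    u = rng.uniform(size=size)
    return mu + sigma * ((-np.log(u)) ** (-xi) - 1.0) / xi

def neighbor_matrix(nrow, ncol):
    """Row-normalized adjacency matrix for four nearest neighbors on a grid."""
    n = nrow * ncol
    W = np.zeros((n, n))
    for i in range(nrow):
        for j in range(ncol):
            nbrs = [(i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)]
            valid = [(a, b) for a, b in nbrs if 0 <= a < nrow and 0 <= b < ncol]
            for a, b in valid:
                W[i * ncol + j, a * ncol + b] = 1.0 / len(valid)
    return W

def simulate_sar(nrow, ncol, rho, rng, xi=0.2):
    """Simulate y from (I - rho * W) y = e with independent GEV innovations e."""
    W = neighbor_matrix(nrow, ncol)
    B = np.eye(nrow * ncol) - rho * W   # invertible when |rho| < 1
    e = gev_sample(rng, nrow * ncol, xi=xi)
    y = np.linalg.solve(B, e)
    return y.reshape(nrow, ncol), e, B

rng = np.random.default_rng(1)
field, innovations, B = simulate_sar(20, 20, rho=0.8, rng=rng)
print(field.shape, field.max())
```

Applying B to the simulated field recovers the independent innovations, which is the sense in which the SAR model maps a location and its neighbors to spatially independent variables. A dense matrix is used here for brevity; a sparse representation would be the natural choice for large grids.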

Keywords

Spatial Autoregressive Model

Generalized Extreme Value Distribution

Convolutional Neural Networks

Parameter Estimation

Spatial Extremes

Quantile Regression 

Co-Author(s)

Douglas Nychka, Colorado School of Mines
Soutir Bandyopadhyay, Colorado School of Mines

Speaker

Sweta Rai