Amortized Learning for Environmental Data using Neural Networks

Abstract Number:

1052 

Submission Type:

Invited Paper Session 

Participants:

Reetam Majumder (1), Brian Reich (1), Amanda Lenzi (2), Raphael Huser (3), Douglas Nychka (4)

Institutions:

(1) North Carolina State University, (2) N/A, (3) KAUST, (4) Colorado School of Mines

Chair:

Reetam Majumder  
North Carolina State University

Co-Organizer:

Brian Reich  
North Carolina State University

Session Organizer:

Reetam Majumder  
North Carolina State University

Speaker(s):

Amanda Lenzi  
N/A
Brian Reich  
North Carolina State University
Raphael Huser  
KAUST
Douglas Nychka  
Colorado School of Mines

Session Description:

Inference for complex spatiotemporal data can be hampered by computational bottlenecks that arise from data size, model assumptions, or often both. For example, Gaussian processes have O(n^3) computational cost in the number of locations n, and full likelihood inference for max-stable processes, a common model for spatial extremes, is limited to 13 locations. However, generating data from these models tends to be quite cheap. A recent approach to dealing with this intractability is to approximate part of the statistical model with a neural network trained on synthetic data generated over a dense set of parameter values. The networks are fitted in advance on the synthetic data and are cheap to evaluate afterwards, so they tend to bypass the computational bottleneck by exploiting the ease of generating samples. For example, one may approximate the expensive likelihood in a Bayesian hierarchical model with a neural network, or bypass the likelihood altogether by estimating the model parameters directly. This general class of approaches, which combines cheap simulation from the model with pre-trained neural networks, constitutes a form of amortized inference and has gained popularity for a variety of geostatistical problems.
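As a rough illustration of the "simulate, pre-train, then evaluate cheaply" idea described above, the following minimal sketch estimates the range parameter of a Gaussian process directly from data. Everything in it is an assumption made for illustration and is not taken from any of the talks: the one-dimensional grid, the exponential covariance with unit variance, the Uniform(0.05, 0.5) prior on the range, the semivariogram summaries used as inputs, and the function names `simulate`, `summarize`, `make_dataset`, `hidden`, and `estimate`. To keep the example short and dependency-free, a random-feature network (fixed random hidden layer, ridge-regression output layer) is swapped in for a deep network trained by gradient descent.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative settings (assumptions): locations, replicates per dataset, variogram lags.
n_loc, n_rep, n_lag = 40, 20, 5
coords = np.linspace(0.0, 1.0, n_loc)
dist = np.abs(coords[:, None] - coords[None, :])

def simulate(theta):
    """Draw n_rep replicates of a zero-mean, unit-variance GP with exponential covariance."""
    cov = np.exp(-dist / theta) + 1e-8 * np.eye(n_loc)  # small jitter for stability
    chol = np.linalg.cholesky(cov)
    return rng.standard_normal((n_rep, n_loc)) @ chol.T

def summarize(z):
    """Empirical semivariogram at lags 1..n_lag, used as the network input."""
    return np.array([0.5 * np.mean((z[:, k:] - z[:, :-k]) ** 2)
                     for k in range(1, n_lag + 1)])

def make_dataset(n_sets):
    """Sample range parameters from the prior and simulate one dataset per draw."""
    theta = rng.uniform(0.05, 0.5, size=n_sets)
    feats = np.stack([summarize(simulate(t)) for t in theta])
    return feats, theta

# --- Amortization step: pay the training cost once, on synthetic data only. ---
x_train, theta_train = make_dataset(2000)
mu, sd = x_train.mean(axis=0), x_train.std(axis=0)

n_hidden = 200
w = rng.standard_normal((n_lag, n_hidden)) / np.sqrt(n_lag)  # fixed random weights
b = rng.standard_normal(n_hidden)

def hidden(x):
    """Random hidden layer applied to standardized summaries."""
    return np.tanh(((x - mu) / sd) @ w + b)

# Output layer: ridge regression of log(theta) on hidden features plus an intercept.
h1 = np.column_stack([hidden(x_train), np.ones(len(x_train))])
ridge = 1e-3
beta = np.linalg.solve(h1.T @ h1 + ridge * np.eye(n_hidden + 1),
                       h1.T @ np.log(theta_train))

def estimate(z):
    """Cheap point estimate of the range parameter for a new dataset z."""
    x = summarize(z)[None, :]
    h = np.column_stack([hidden(x), np.ones(1)])
    return float(np.exp(h @ beta)[0])

# --- Evaluation on fresh simulated datasets with known parameters. ---
x_test, theta_test = make_dataset(300)
h_test = np.column_stack([hidden(x_test), np.ones(len(x_test))])
theta_hat = np.exp(h_test @ beta)
rmse = float(np.sqrt(np.mean((theta_hat - theta_test) ** 2)))
print(f"held-out RMSE for the range parameter: {rmse:.3f}")
```

Once `beta` has been computed, calling `estimate` on a new dataset involves only a few small matrix products and no likelihood evaluation, which is the sense in which the cost of inference is amortized over the up-front simulation and training.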

This session aims to showcase the current state of the science for amortized learning in geostatistical problems. Amanda Lenzi will introduce a distributed approach to scaling black-box parameter estimation and inference in large spatial settings. Douglas Nychka will present a deep learning approach to accelerate fitting a nonstationary covariance function. Raphael Huser will talk about his work on neural Bayes estimation for censored inference in spatial extremes models. Brian Reich will present a deep learning-Vecchia approximation for inference in spatial regression models. The talks will cover different neural network architectures and their appropriateness in various situations, discuss uncertainty quantification and inference for frequentist and Bayesian neural approaches, and comment on the broader impact amortized learning can have on tackling large-scale geostatistical problems. The broader environmental and climate change communities have embraced neural networks for analyzing massive datasets that can be easily obtained from climate models (like CMIP6) and remote sensing (like GridMET), and the session's goal is to present amortized learning using neural networks as a scalable approach to the statistical modeling of these complex spatiotemporal data. Our speakers are affiliated with institutions across three continents, and we anticipate that research in this domain will prove crucial for informing policy surrounding extreme weather and climate change.

Sponsors:

Section on Statistical Computing 2
Section on Statistical Learning and Data Science 3
Section on Statistics and the Environment 1

Theme: Statistics and Data Science: Informing Policy and Countering Misinformation

Yes

Applied

Yes

Estimated Audience Size

Medium (80-150)

I have read and understand that JSM participants must abide by the Participant Guidelines.

Yes

I understand and have communicated to my proposed speakers that JSM participants must register and pay the appropriate registration fee by June 1, 2024. The registration fee is nonrefundable.

I understand