Statistical Applications for Climate and Environmental Data II

Lelys Isaura Bravo de Guenni Chair
University of Illinois At Urbana-Champaign
 
Thursday, Aug 7: 10:30 AM - 12:20 PM
4228 
Contributed Papers 
Music City Center 
Room: CC-105B 

Main Sponsor

Section on Statistics and the Environment

Presentations

A Kernel-Based Approach For Photovoltaic Power Prediction in South Africa

Photovoltaic (PV) solar power generation represents a viable option for meeting increased electricity demand. Accurate solar power predictions are crucial for feasibility studies of new installations and successful integration of PV systems into existing power grids. This need is particularly acute in South Africa, where the expansion of renewable energy capacity must be balanced against grid stability. This study applies kernel-based approaches to PV power prediction, focusing on capturing multi-scale temporal patterns in solar power generation. Using data from a large-scale PV installation in South Africa's Northern Cape region, the study investigates how kernel ridge regression can model both the inherent periodicity of solar power generation and its weather-dependent variations. The methodology addresses the complex interplay between weather patterns, seasonal variations, and power generation. The research contributes to the practical advancement of PV power prediction in renewable energy applications, with direct implications for grid integration and operational planning in regions with significant solar power installations. 
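
As a rough illustration of how kernel ridge regression can combine periodicity with weather dependence, the sketch below multiplies a periodic kernel on time of day by a squared-exponential kernel on a clear-sky index. The kernel form, the hyperparameters, the variable names, and the synthetic data are all assumptions made for this example (NumPy is assumed to be available); the abstract does not specify the authors' model.

```python
import numpy as np

def periodic_kernel(t1, t2, period=24.0, length=1.0):
    """Exp-sine-squared kernel on time in hours; encodes a repeating daily cycle."""
    d = np.abs(t1[:, None] - t2[None, :])
    return np.exp(-2.0 * np.sin(np.pi * d / period) ** 2 / length ** 2)

def rbf_kernel(x1, x2, length=0.3):
    """Squared-exponential kernel on a scalar weather covariate."""
    d = x1[:, None] - x2[None, :]
    return np.exp(-0.5 * (d / length) ** 2)

def kernel(t1, c1, t2, c2):
    # Product kernel: the daily cycle is modulated by the weather covariate.
    return periodic_kernel(t1, t2) * rbf_kernel(c1, c2)

def krr_fit(t, c, y, lam=1e-2):
    """Solve (K + lam * I) alpha = y for the dual coefficients."""
    K = kernel(t, c, t, c)
    return np.linalg.solve(K + lam * np.eye(len(y)), y)

def krr_predict(alpha, t_train, c_train, t_new, c_new):
    return kernel(t_new, c_new, t_train, c_train) @ alpha

# Synthetic stand-in data (not real PV measurements): 10 days of hourly values.
rng = np.random.default_rng(0)
t = np.arange(0.0, 240.0)
clear = rng.uniform(0.2, 1.0, size=t.size)            # hypothetical clear-sky index
power = clear * np.clip(np.sin(np.pi * ((t % 24) - 6) / 12), 0, None)
y = power + rng.normal(0, 0.02, size=t.size)

# Train on the first 8 days, predict the last 2.
alpha = krr_fit(t[:192], clear[:192], y[:192])
pred = krr_predict(alpha, t[:192], clear[:192], t[192:], clear[192:])
rmse = float(np.sqrt(np.mean((pred - power[192:]) ** 2)))
```

A product kernel is used here so that the weather covariate scales the daily profile; a sum of kernels would instead treat the two effects as additive.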

Keywords

Photovoltaic power prediction

kernel methods

machine learning

renewable energy

applied statistics 

Co-Author(s)

Chantelle Clohessy, Nelson Mandela University
Mpho Mpofu, Nelson Mandela University

First Author

Stefan Janse van Rensburg, Nelson Mandela University

Presenting Author

Stefan Janse van Rensburg, Nelson Mandela University

Correcting Precipitation Forecast Displacement Errors Using Machine Learning

In meteorological forecasting, convection-allowing grid-spacing models have significantly improved the simulation of heavy rainfall associated with warm-season convection. However, substantial errors in precipitation location persist, posing challenges for critical applications such as flood prediction. In this study, we develop machine learning (ML) tools to correct displacement errors in High-Resolution Ensemble Forecast (HREF) members using detailed mesoscale weather data from the Storm Prediction Center. The Method for Object-based Diagnostic Evaluation (MODE) was employed to identify key precipitation object characteristics, which served as inputs for ML models designed to refine centroid location errors in mesoscale convective systems across eight HREF ensemble members. Trained on data from 2018 to 2023, the models were tested in real time during the 2024 Flash Flood and Intense Rainfall experiments. The best-performing ML model achieved an average reduction of 35–51% in storm centroid location error over the original HREF forecasts, demonstrating its potential for enhancing flood prediction accuracy. 
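
One common way to measure a centroid location error is the great-circle distance between the forecast and observed storm centroids, which the keywords below also mention. The following is a minimal sketch using the haversine formula; the function names, the spherical Earth radius, and the percent-reduction helper are assumptions for illustration, not the authors' code.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a spherical approximation

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # min() guards against rounding slightly above 1 for near-antipodal points.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))

def percent_reduction(raw_err_km, corrected_err_km):
    """Relative reduction in centroid error after correction, in percent."""
    return 100.0 * (raw_err_km - corrected_err_km) / raw_err_km
```

For example, `percent_reduction(100.0, 60.0)` gives a 40% reduction for a hypothetical forecast whose centroid error drops from 100 km to 60 km.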

Keywords

Mesoscale convective systems

Quantitative precipitation forecast

Mesoscale weather data

Great-circle distance

Machine learning postprocessor

Probability matched mean 

Co-Author(s)

Tyreek Frazier, Iowa State University
Somak Dutta, Iowa State University
William A. Gallus, Jr., Iowa State University
Kristie J. Franz, Iowa State University

First Author

Aniruddha Pathak, Iowa State University

Presenting Author

Aniruddha Pathak, Iowa State University

Ice Model Calibration using Diffusion Models

Rapid changes in the cryosphere can have major climate-related consequences, such as global sea-level rise. Computer models are useful for understanding the behavior of Antarctic ice sheets and can be used to study their impact on rising sea levels. However, uncertainty quantification of model parameters is challenging because the model outputs and observations are high-dimensional and spatially correlated. Furthermore, they are semicontinuous with an excess of zeros. To address these challenges, we propose a diffusion model-based emulator that can accurately generate pseudodata across various parameter settings. Since the resulting likelihood from the emulator is intractable, we propose an approximate Bayesian computation method with a Siamese network. The Siamese network is trained to determine whether images generated by the emulator with proposed parameters closely resemble observational data based on the similarity of their features. We apply our method to calibrate a computer model of the West Antarctic Ice Sheet and to generate future projections of sea-level rise based on modern ice sheet observations, a setting where current approaches are infeasible due to the aforementioned challenges. 
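
The accept/reject logic of approximate Bayesian computation can be sketched on a toy problem. In the example below, a simple zero-inflated simulator stands in for the diffusion-model emulator, and a Euclidean distance between two hand-picked summary statistics stands in for the Siamese-network similarity. Both substitutions, along with the prior, tolerance, and all names, are assumptions for illustration and are far simpler than the method described above.

```python
import math
import random

def simulate(theta, n=200, rng=random):
    """Toy semicontinuous data: zero with probability 0.4, else exponential with mean theta."""
    return [0.0 if rng.random() < 0.4 else rng.expovariate(1.0 / theta) for _ in range(n)]

def summary(x):
    """Summaries for semicontinuous data: fraction of zeros and mean of the positive part."""
    pos = [v for v in x if v > 0]
    frac_zero = 1 - len(pos) / len(x)
    mean_pos = sum(pos) / len(pos) if pos else 0.0
    return frac_zero, mean_pos

def distance(s1, s2):
    # Stand-in for a learned similarity score between simulated and observed data.
    return math.hypot(s1[0] - s2[0], s1[1] - s2[1])

def abc_rejection(observed, prior_draw, n_draws=2000, tol=0.15, seed=1):
    """Keep prior draws whose simulated summaries fall within tol of the observed ones."""
    rng = random.Random(seed)
    s_obs = summary(observed)
    accepted = []
    for _ in range(n_draws):
        theta = prior_draw(rng)
        if distance(summary(simulate(theta, rng=rng)), s_obs) < tol:
            accepted.append(theta)
    return accepted

# "Observed" data generated with a known parameter, then calibrated under a uniform prior.
obs = simulate(2.0, rng=random.Random(0))
post = abc_rejection(obs, lambda rng: rng.uniform(0.5, 5.0))
```

The accepted values in `post` approximate draws from the posterior of `theta`; a smaller `tol` tightens the approximation at the cost of fewer accepted draws.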

Keywords

Ice model calibration

diffusion model

approximate Bayesian computation

Siamese network

semicontinuous spatial data 

Co-Author(s)

Won Chang, Seoul National University
Jaewoo Park, Yonsei University, Department of Applied Statistics

First Author

Kanghyun Wi, Yonsei University

Presenting Author

Kanghyun Wi, Yonsei University

Inside the Heat Dome: analyzing heatstroke hospitalizations using Bayesian synthetic control with spatially augmented priors

The synthetic control (SC) method is widely used to estimate causal effects using panel data. However, the classical SC framework does not account for spatial dependence and spill-over effects common when observational units represent spatial entities such as cities, counties, or regions. Spatial correlation and latent spatial confounding can bias estimates, yet little research has addressed these issues systematically, and simulation studies in this context remain scarce.

We propose the spatially augmented Bayesian synthetic control (SA-BSC), which integrates geographic distance into spike-and-slab priors on donor weights. Two specifications are available: distance-to-binary (D2B), where a control unit's inclusion probability decays with distance, and distance-to-variance (D2V), which exponentially shrinks the prior variance of distant donors. This approach lets us incorporate additional information into the synthetic control estimation, leveraging the flexibility of semiparametric spatial priors for weight estimation. Through extensive simulations varying the pre-treatment window length, spatial autocorrelation, and magnitude of spill-over effects, we find that SA-BSC substantially reduces root-mean-squared error and improves posterior-interval coverage compared to standard non-spatial synthetic control methods.
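
To make the two prior specifications concrete, the sketch below shows one possible way distance could enter a spike-and-slab prior on a donor weight. The exponential decay for D2B, the decay scale `rho_km`, the baseline values, and the function names are all assumptions for illustration; the abstract does not give the exact functional forms.

```python
import math
import random

def d2b_inclusion_prob(dist_km, base_prob=0.9, rho_km=200.0):
    """D2B (assumed form): prior inclusion probability decays with distance."""
    return base_prob * math.exp(-dist_km / rho_km)

def d2v_slab_variance(dist_km, tau0_sq=1.0, rho_km=200.0):
    """D2V (assumed form): slab variance shrinks exponentially with distance."""
    return tau0_sq * math.exp(-dist_km / rho_km)

def draw_weight(dist_km, spec, rng, base_prob=0.5, tau0_sq=1.0):
    """One prior draw of a donor weight: exactly zero (spike) or Gaussian (slab)."""
    if spec == "D2B":
        p, var = d2b_inclusion_prob(dist_km), tau0_sq
    elif spec == "D2V":
        p, var = base_prob, d2v_slab_variance(dist_km, tau0_sq)
    else:
        raise ValueError(f"unknown specification: {spec}")
    included = rng.random() < p
    return rng.gauss(0.0, math.sqrt(var)) if included else 0.0
```

Under either assumed form, a donor 500 km away receives less prior support than one 100 km away: D2B makes it less likely to enter the synthetic control at all, while D2V pulls its weight more strongly toward zero.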

We illustrate the application of SA-BSC with a large-scale observational study examining acute heatstroke hospitalizations among an open cohort of fee-for-service Medicare beneficiaries in the contiguous United States, covering 34.5 million individuals from 2000 to 2016. Daily maximum heat-index data are linked to residential ZIP codes, and heat waves are defined as periods of two or more consecutive days exceeding the local 95th percentile. Each exposed ZIP-day constitutes a treated unit, with counterfactual donors constructed from contemporaneously unexposed ZIP codes. SA-BSC provides spatially coherent counterfactual outcomes and robust, interpretable causal estimates, highlighting its value for observational studies with complex spatial structures. 
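
The heat-wave definition above (two or more consecutive days above a local 95th-percentile threshold) can be sketched for a single ZIP code as follows. The percentile interpolation rule, the reference period used to compute the threshold, and the function names are assumptions, since the abstract does not specify them.

```python
def percentile(values, q):
    """Percentile with linear interpolation between order statistics (q in [0, 100])."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)

def heat_wave_days(heat_index, threshold, min_run=2):
    """Flag each day that belongs to a run of at least min_run consecutive exceedances."""
    flags = [False] * len(heat_index)
    run_start = None
    # A trailing sentinel below any threshold closes a run that reaches the last day.
    for i, v in enumerate(list(heat_index) + [float("-inf")]):
        if v > threshold:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_run:
                for j in range(run_start, i):
                    flags[j] = True
            run_start = None
    return flags
```

In practice the threshold would come from something like `percentile(reference_series, 95)` computed per ZIP code, where `reference_series` is a hypothetical long-run record of daily maximum heat index for that location. An isolated single day above the threshold is not flagged.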

Keywords

Gun violence

Heatwaves

Causal inference

Spatial statistics

Synthetic controls

Environmental health 

Co-Author(s)

Giulio Grossi
Falco J. Bargagli Stoffi, University of California, Los Angeles (UCLA)
Francesca Dominici, Harvard School of Public Health

First Author

Leo Vanciu, Harvard University

Presenting Author

Leo Vanciu, Harvard University

Random Elastic Space-Time (REST) Prediction and Solar Irradiance Studies

As the power grid moves to a more renewable future, energy sources from weather-driven phenomena such as solar power will form an increasingly large portion of electricity generation. The variability, non-Gaussianity and intermittency of solar resources challenge current grid operation paradigms, and realistic data scenarios are required for grid planning and operational studies. However, such data are not available at the space-time resolution needed for realistic grid models. Given sparse spatial samples, we introduce a method for spatiotemporal prediction in a functional data analysis framework when data exhibit nonstationary phase misalignment. The approach is illustrated on a challenging high-frequency irradiance dataset and compared with existing methods. 

Keywords

curve registration

distributed photovoltaic systems

functional data analysis

spatiotemporal prediction 

Co-Author

William Kleiber, University of Colorado

First Author

Nicolas Coloma

Presenting Author

Nicolas Coloma

Stochastic spatial stream networks for scalable inferences of riverscape processes

Spatial stream network (SSN) models characterize correlated ecological processes in dendritic ecosystems. Conventional SSN models rely on pre-processed stream networks and point-to-point hydrologic distances. However, this data processing may be labor-intensive and time-consuming over large spatial domains. Therefore, we propose to infer the functional connectivity of stream networks stochastically. Our physically guided model utilizes the knowledge that water flows from high elevation to low elevation, and that flow rate increases when two tributaries merge. We also leverage the hierarchical branching architecture of dendritic networks to reduce the computational burden and uncertainty. Spatial autoregressive models composed of inferred SSNs propagate stochasticity between network connectivity and population dynamics in a Bayesian framework. We show in simulated examples that our mechanistic model facilitated learning about the functional network and enhanced predictive performance. We also demonstrate our approach in a large-scale case study using native brook trout (Salvelinus fontinalis) count data. 
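
The two physical constraints mentioned above (water flows downhill, and flow increases where tributaries merge) can be illustrated with a small sketch. Here each site draws one downstream neighbor among its lower-elevation neighbors, with probability proportional to the elevation drop, and flow is then accumulated from high to low elevation. The sampling rule, the unit local runoff, and all names are assumptions for illustration rather than the authors' model.

```python
import random

def sample_network(elev, neighbors, rng):
    """Draw one downstream neighbor per site; only lower-elevation neighbors are eligible."""
    down = {}
    for site, nbrs in neighbors.items():
        lower = [n for n in nbrs if elev[n] < elev[site]]
        if not lower:
            down[site] = None  # no lower neighbor: treat the site as an outlet
        else:
            drops = [elev[site] - elev[n] for n in lower]
            down[site] = rng.choices(lower, weights=drops, k=1)[0]
    return down

def accumulate_flow(elev, down, local_runoff=1.0):
    """Pass flow downstream; visiting sites from high to low finalizes upstream totals first."""
    flow = {site: local_runoff for site in elev}
    for site in sorted(elev, key=elev.get, reverse=True):
        if down[site] is not None:
            flow[down[site]] += flow[site]
    return flow

# Hypothetical four-site example: two headwaters (a, b) merge at c, which drains to d.
elev = {"a": 500.0, "b": 480.0, "c": 300.0, "d": 100.0}
neighbors = {"a": ["c"], "b": ["c"], "c": ["a", "b", "d"], "d": ["c"]}
down = sample_network(elev, neighbors, random.Random(0))
flow = accumulate_flow(elev, down)
```

In this small example every site has exactly one lower neighbor, so the sampled network is the same on every draw; with more candidate neighbors, repeated draws would give different plausible networks.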

Keywords

Bayesian hierarchical models

Markov random field

population models

space-time dynamics 

Co-Author(s)

Andee Kaplan, Colorado State University
Mevin Hooten, The University of Texas At Austin
Yoichiro Kanno, Colorado State University
Jacob Rash, North Carolina Wildlife Resources Commission
George Valentine, USDA

First Author

Xinyi Lu

Presenting Author

Xinyi Lu