Tuesday, Aug 6: 8:30 AM - 10:20 AM
5090
Contributed Papers
Oregon Convention Center
Room: CC-B111
Main Sponsor
Section on Statistics and the Environment
Presentations
We propose a new method to adjust for the bias that occurs when citizen scientists monitor a fixed location and report whether an event of interest has occurred or not, such as whether a plant has bloomed. The bias arises as monitors note whether the event has happened upon arrival, lacking the precise day of occurrence. Adjustment is important because differences in monitoring patterns can make local environments appear more or less anomalous than they actually are, and the bias may persist when the data are aggregated across space or time. To correct for this bias, we propose a nonparametric Bayesian model that uses monotonic splines to estimate the distribution of bloom dates at different sites. We then use our model to determine whether the lilac monitored by citizen scientists in the northeast US bloomed anomalously early or late, preliminary evidence of environmental stress caused by climate change. Our analysis suggests that failing to correct for monitoring bias would underestimate the peak bloom date by 32 days on average. In addition, after adjusting for monitoring bias, several locations have anomalously early bloom dates that did not appear anomalous before adjustment.
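As a rough illustration of the monitoring problem described in this abstract, the sketch below simulates visits that only record whether bloom has already occurred, then recovers a monotone estimate of the bloom-date distribution. It swaps in plain isotonic regression (pool-adjacent-violators) for the abstract's Bayesian monotonic-spline model, and every number in it (bloom distribution, visit window, sample size) is a hypothetical assumption, not a value from the study.

```python
import random

def pava(values):
    """Pool-adjacent-violators: least-squares non-decreasing fit to `values`."""
    blocks = []  # each block holds [mean, number of pooled points]
    for v in values:
        blocks.append([float(v), 1])
        # Merge backwards while the last two blocks break monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, n2 = blocks.pop()
            m1, n1 = blocks.pop()
            blocks.append([(m1 * n1 + m2 * n2) / (n1 + n2), n1 + n2])
    fitted = []
    for mean, count in blocks:
        fitted.extend([mean] * count)
    return fitted

def simulate_visits(n, seed=0):
    """Each record is (visit day, 1 if bloom had already occurred else 0)."""
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        bloom_day = rng.gauss(120.0, 10.0)    # hypothetical true bloom day of year
        visit_day = rng.uniform(80.0, 160.0)  # hypothetical monitoring day
        records.append((visit_day, 1 if bloom_day <= visit_day else 0))
    records.sort()
    return records

records = simulate_visits(2000)
visit_days = [day for day, _ in records]
bloomed = [flag for _, flag in records]

# Monotone estimate of P(bloom day <= visit day), evaluated at the visit days.
cdf_hat = pava(bloomed)

# Estimated median bloom day: first visit day where the fitted CDF reaches 0.5.
median_hat = next(day for day, f in zip(visit_days, cdf_hat) if f >= 0.5)

# Naive summary: average day on which monitors saw the plant already in bloom.
bloom_visits = [day for day, flag in records if flag == 1]
naive_day = sum(bloom_visits) / len(bloom_visits)

print(f"monotone median estimate: {median_hat:.1f}")
print(f"naive mean of 'already bloomed' visits: {naive_day:.1f}")
```

In this toy setup the naive summary lands well after the simulated bloom dates because it averages arrival days rather than occurrence days; the direction and size of the bias in real data depend on the monitoring pattern.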
Keywords
nonparametric Bayes
monotonic splines
monitoring bias
bias correction
crowdsourcing
climate change
Accurate crop modeling is essential to maintain food security in the United States, especially in the face of ongoing climate change. As climate conditions change regionally, it will be essential to understand both marginal and interaction effects of climate variables on crop yield.
We present a Bayesian Spatially-Varying Functional Regression model for crop yield. This model looks at the combined effect of two climate variables on crop yield. Previous models have either modeled the combined effect as a scalar term, or have used functional regression but neglected the interaction. The estimated weight functions from the functional regression are used to identify periods of vulnerability during different crop growth stages. We also account for spatial variability in modeling climate variable effects on yield.
We apply our model to recent crop yield data in the midwestern United States, using vapor pressure deficit and soil moisture as the climate variables. We demonstrate the performance and interpretability of the model and compare it with other crop yield models.
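For orientation, one generic way to write a functional regression with an interaction between two functional covariates is sketched below. This is an assumed textbook-style form, not the authors' specification: the symbols $Y_i$ (yield at location $s_i$), $X_{i1}(t)$ and $X_{i2}(t)$ (the two climate curves over the growing season $[0, T]$), and the weight functions $\beta_1$, $\beta_2$, $\beta_{12}$ are all introduced here for illustration.

```latex
Y_i = \alpha(s_i)
    + \int_0^T \beta_1(t; s_i)\, X_{i1}(t)\, dt
    + \int_0^T \beta_2(t; s_i)\, X_{i2}(t)\, dt
    + \int_0^T \beta_{12}(t; s_i)\, X_{i1}(t)\, X_{i2}(t)\, dt
    + \varepsilon_i,
\qquad \varepsilon_i \sim N(0, \sigma^2)
```

In a form like this, periods where an estimated weight function is large in magnitude would correspond to growth stages where yield is most sensitive to that variable, and the dependence on $s_i$ lets those weights vary across space.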
Keywords
spatial statistics
functional data analysis
Bayesian statistics
crop yield modeling
Abstracts
Environmental data are often observational and exhibit spatial dependence, making causal effects of treatments or policies difficult to estimate. Unmeasured spatial confounders, i.e., spatial processes correlated with both the treatment assignment mechanism and the outcome, can introduce bias when estimating causal effects of interest because they violate important assumptions in causal inference. Spatial data can also be subject to preferential sampling, where the sampling of locations is related to unmeasured confounders or the response variable, which introduces additional bias into the estimation of model parameters. We propose a spatial causal inference method that simultaneously accounts for unmeasured spatial confounders in both sampling locations and treatment allocation. We prove the identifiability of key parameters in the model and the consistency of the posterior distributions of those parameters. We also show via simulation studies that the causal effect of interest can be reliably estimated under the proposed model. The proposed method is applied to assess the effect of policies that govern marine protected areas on fish biodiversity.
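The preferential-sampling bias mentioned in this abstract can be seen in a small simulation. The sketch below is only an illustration of the problem, not the proposed method: it draws a hypothetical one-dimensional field, samples sites either uniformly or with probability proportional to `exp(beta * z)`, and compares the resulting sample means. The field, `beta`, and the sample sizes are assumptions chosen for the example.

```python
import math
import random

rng = random.Random(1)

# Hypothetical latent field on 5000 sites in [0, 1]: smooth signal plus noise.
sites = [i / 4999 for i in range(5000)]
field = [math.sin(2 * math.pi * s) + rng.gauss(0.0, 0.5) for s in sites]
true_mean = sum(field) / len(field)

# Uniform design: every site is equally likely to be sampled.
uniform_sample = rng.sample(field, 500)
uniform_mean = sum(uniform_sample) / len(uniform_sample)

# Preferential design: sites with larger field values are more likely to be
# visited. beta controls how strongly sampling depends on the field.
beta = 1.0
weights = [math.exp(beta * z) for z in field]
pref_sample = rng.choices(field, weights=weights, k=500)
pref_mean = sum(pref_sample) / len(pref_sample)

print(f"field mean:               {true_mean:.3f}")
print(f"uniform-sample mean:      {uniform_mean:.3f}")
print(f"preferential-sample mean: {pref_mean:.3f}")
```

With `beta = 0` the two designs coincide; increasing `beta` pulls the preferential-sample mean further above the field mean.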
Keywords
Poisson process
Preferential sampling
Spatial confounding
Potential outcomes
The increasing volumes of species observation data being collected by citizen-science projects around the world have great potential for monitoring populations and helping to identify the drivers of population change. However, realizing this potential requires methods that can 1) estimate heterogeneous patterns of population change that arise when multiple drivers (e.g., change in land use and climate) affect species populations simultaneously, and 2) control for confounding sources of inter-annual variation common in citizen-science datasets. In this presentation we investigate the use of machine learning-based estimators designed for Conditional Average Treatment Effect (CATE) estimation (including Causal Forests and meta-learners) to address these challenges. Using a simulation study and data from the citizen-science project eBird, we assess performance in estimating spatially varying trends in population size and identifying drivers of population change in the face of real-world confounding. We discuss results showing how this approach can recover heterogeneous trends and discuss outstanding challenges.
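To make the meta-learner idea in this abstract concrete, here is a minimal T-learner sketch on simulated data. It is an assumption-laden toy, not the presenters' pipeline: the base learners are simple least-squares lines rather than machine-learning models, and the covariate `x`, the binary driver `t`, and the outcome `y` are hypothetical stand-ins (for example, a site covariate, a land-use change indicator, and a change in an abundance index).

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

rng = random.Random(42)
data = []
for _ in range(4000):
    x = rng.uniform(-1.0, 1.0)
    # Confounding: exposure to the driver is more likely where x is large.
    t = 1 if rng.random() < 0.5 + 0.3 * x else 0
    # Simulated truth: the effect of the driver is 0.5 + x, so it varies with x.
    y = 1.0 + 2.0 * x + t * (0.5 + 1.0 * x) + rng.gauss(0.0, 0.5)
    data.append((x, t, y))

treated = [(x, y) for x, t, y in data if t == 1]
control = [(x, y) for x, t, y in data if t == 0]

# Naive comparison of group means, which ignores the confounding through x.
naive_diff = (sum(y for _, y in treated) / len(treated)
              - sum(y for _, y in control) / len(control))

# T-learner: fit one outcome model per arm, then difference the predictions.
a1, b1 = fit_line([x for x, _ in treated], [y for _, y in treated])
a0, b0 = fit_line([x for x, _ in control], [y for _, y in control])

def cate(x):
    """Estimated conditional average treatment effect at covariate value x."""
    return (a1 + b1 * x) - (a0 + b0 * x)

print(f"naive difference in means: {naive_diff:.2f}")
print(f"estimated CATE at x = 0:   {cate(0.0):.2f}")
print(f"estimated CATE at x = 1:   {cate(1.0):.2f}")
```

Because the effect in this simulation varies with `x`, a single average difference cannot describe it; the per-arm models recover how the effect changes across the covariate, which is the kind of heterogeneity the abstract targets.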
Keywords
Biodiversity monitoring
Conservation
Ecology
Causal machine learning
Double machine learning
Spatiotemporal
Species distribution modelling
First Author
Daniel Fink, Cornell Lab of Ornithology
Presenting Author
Daniel Fink, Cornell Lab of Ornithology
Climate change and its impact on humanity are a pressing concern. Air pollutant levels are constantly monitored, and we use publicly available resources from the United States Environmental Protection Agency to access the distribution of particular pollutants at a number of sites over time. Each spatial location has its own spatially dependent pollutant quantile function, which varies with time. By modelling the quantiles simultaneously, we aim to reduce computational complexity relative to existing methodologies. We use a quantile regression method that uses functional principal components to reduce the dimension over space and quantile levels while testing for trends in air pollution data over the last 20 years. We present an extensive comparison with existing methods in the literature.
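The dimension-reduction step in this abstract can be sketched on a discrete grid of quantile levels. The example below is a simplified stand-in, not the presenters' method: it uses a single hypothetical site, simulated log-normal pollutant readings with a declining trend, deciles as the quantile grid, and power iteration to extract the leading principal component of the yearly quantile functions.

```python
import math
import random
import statistics

rng = random.Random(7)
years = list(range(2000, 2020))

# One row per year: the nine deciles of that year's simulated daily readings.
quantile_rows = []
for year in years:
    mu = 2.5 - 0.04 * (year - 2000)  # hypothetical declining log-scale level
    readings = [math.exp(rng.gauss(mu, 0.5)) for _ in range(365)]
    quantile_rows.append(statistics.quantiles(readings, n=10))

n_years, n_levels = len(quantile_rows), 9
col_means = [sum(row[j] for row in quantile_rows) / n_years
             for j in range(n_levels)]
centered = [[row[j] - col_means[j] for j in range(n_levels)]
            for row in quantile_rows]

# Sample covariance across quantile levels.
cov = [[sum(r[j] * r[k] for r in centered) / (n_years - 1)
        for k in range(n_levels)] for j in range(n_levels)]

def apply_cov(vec):
    return [sum(cov[j][k] * vec[k] for k in range(n_levels))
            for j in range(n_levels)]

# Power iteration for the leading eigenvector (first principal component).
v = [1.0] * n_levels
for _ in range(500):
    w = apply_cov(v)
    norm = math.sqrt(sum(x * x for x in w))
    v = [x / norm for x in w]

eigenvalue = sum(a * b for a, b in zip(v, apply_cov(v)))
explained = eigenvalue / sum(cov[j][j] for j in range(n_levels))

# Component scores per year, and their correlation with calendar year.
scores = [sum(r[j] * v[j] for j in range(n_levels)) for r in centered]
mean_year = sum(years) / n_years
syy = sum((y - mean_year) ** 2 for y in years)
sss = sum(s * s for s in scores)  # scores already have mean zero
sys_ = sum((y - mean_year) * s for y, s in zip(years, scores))
slope = sys_ / syy
corr = sys_ / math.sqrt(syy * sss)

print(f"variance explained by first component: {explained:.2f}")
print(f"trend in scores per year: {slope:.3f} (correlation {corr:.2f})")
```

Testing for a trend then reduces to a regression on a handful of component scores instead of one regression per quantile level, which is where the computational saving would come from in a setup like this.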
Keywords
spatial
quantile
regression
functional
computation
pollutants
We propose a latent spatio-temporal causal model for a class of causal estimands that go beyond the conditional expectation. In particular, we focus on estimands for contemporaneous and lagged effects that serve as descriptors of the tail behaviour of the predictive distribution of the underlying spatio-temporal process. Under mild sufficient conditions, we theoretically validate the correctness of the causal interpretation and further prove: i) the identifiability of causal effects using the full observational distribution; and ii) the consistency of our model estimator. We provide a simulation study to illustrate the correctness of our asymptotic consistency theorem and showcase the advantages of using a causal estimand that focuses on the tails over the traditional conditional expectation. Finally, we apply our framework to quantify causal spatio-temporal structures in U.S. wildfire and air quality data.
Keywords
air quality data
causal inference
extreme event
spatio-temporal process
tail-descriptive estimand
wildfire data