Wednesday, Aug 6: 8:30 AM - 10:20 AM
4142
Contributed Papers
Music City Center
Room: CC-214
This session explores innovative approaches to scientific modeling and decision-making across diverse fields, with a focus on the integration of machine learning, uncertainty quantification, and advanced computational techniques. Presentations will highlight real-time forecasting applications, predictive models for complex systems, and data-driven solutions in material science, molecular structures, and environmental phenomena.
Main Sponsor
Section on Statistics in Defense and National Security
Presentations
Deep models are powerful tools that are increasingly used in analyses of complex, multimodal data. Many attempts at producing estimates with associated uncertainty are ad hoc at best, offering only a relative notion of uncertainty. Statistically valid uncertainty quantification (UQ) to accompany deep model predictions is necessary for these methods to gain traction in high-risk applications, where expensive or important decisions hinge on the results. Conformal prediction promises statistically valid intervals (assuming exchangeable data), but its UQ does not vary with the local difficulty of the problem. We propose an extension of conformal prediction to computer vision that provides UQ for a deep model with complex, multimodal inputs, and we explore methods for providing local adaptivity beyond simple continuous inputs in an image-input application.
Keywords
Conformal prediction
Uncertainty quantification
Deep learning
Computer vision
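The split-conformal construction the abstract builds on can be sketched in a few lines. This is a generic illustration on synthetic scalar data, not the authors' computer-vision method; the function name and the simulated calibration set are hypothetical.

```python
import numpy as np

def split_conformal_interval(cal_pred, cal_true, test_pred, alpha=0.1):
    """Split conformal prediction: calibrate absolute-residual
    nonconformity scores on held-out data, then form symmetric
    intervals with 1 - alpha marginal coverage (assuming
    exchangeable data)."""
    n = len(cal_true)
    scores = np.sort(np.abs(cal_true - cal_pred))
    # finite-sample-corrected order statistic of the scores
    k = min(n - 1, int(np.ceil((n + 1) * (1 - alpha))) - 1)
    q = scores[k]
    return test_pred - q, test_pred + q

# Synthetic calibration set: predictions off by N(0, 0.5) noise.
rng = np.random.default_rng(0)
cal_true = rng.normal(size=500)
cal_pred = cal_true + rng.normal(scale=0.5, size=500)
lo, hi = split_conformal_interval(cal_pred, cal_true, np.array([0.0]))
```

The resulting interval has a single global width, which is exactly the limitation the abstract targets: locally adaptive variants rescale the scores by a per-input difficulty estimate.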
The Department of Defense has realigned its approach to data management. Prior to 2020, data was viewed as a strategic risk for the Department. Now it is seen as a strategic asset that will position the Department for joint all-domain operations and artificial intelligence applications. Commensurate with Department-level data policy, the Director, Operational Test and Evaluation has published new policy that test programs shall create data management plans to make test data VAULTIS (visible, accessible, understandable, linked, trustworthy, interoperable, and secure). In this briefing, I will motivate the necessity for testers to take an intentional approach to data management, tour the new policies for test data, and provide an overview of the Data Management Plan Guidebook, an approach to planning for test data management that is in line with the DoD's VAULTIS framework.
Keywords
Data Management
System Testing
Test Planning
Department of Defense
Operational Testing
Computationally expensive numerical solutions of partial differential equation (PDE) models are critical to understanding complex real-world phenomena. However, the utility of these numerical simulations in many applications is limited by the computational cost of running the models over the breadth of initial conditions needed to characterize the space of feasible solutions. We propose RANDPROM, a novel surrogate framework that combines cutting-edge reduced-order approximations of PDEs with a Bayesian hierarchical model, enabling accurate and precise predictions of PDE solutions at initial conditions not included in the training data. Using simulations of tsunami wave height as an example dataset, we demonstrate that RANDPROM can produce near-real-time predictions of wave height, illustrating the framework's potential to contribute to real-world decision making.
Keywords
PDE surrogates
Hierarchical models
Uncertainty quantification
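RANDPROM itself is not described in enough detail to reproduce, but the generic reduced-order surrogate idea it builds on (project expensive PDE solutions onto a low-rank basis, then learn the map from initial condition to basis coefficients) can be sketched on toy data. The sine-wave "solutions" and polynomial regression below are hypothetical stand-ins; a Bayesian hierarchical model would replace the regression step.

```python
import numpy as np

rng = np.random.default_rng(1)

# Fake "simulation" snapshots: each row is a PDE solution on a
# 200-point spatial grid, generated from a scalar initial condition.
thetas = rng.uniform(0.5, 2.0, size=40)
grid = np.linspace(0, 1, 200)
snapshots = np.array([np.sin(np.pi * grid * t) for t in thetas])
mean = snapshots.mean(0)

# Reduced-order basis from the SVD of the centered snapshot matrix.
U, s, Vt = np.linalg.svd(snapshots - mean, full_matrices=False)
k = 5
basis = Vt[:k]                          # k spatial modes
coeffs = (snapshots - mean) @ basis.T   # modal coefficients per run

# Polynomial regression from initial condition to modal coefficients.
design = np.vander(thetas, 6)
beta, *_ = np.linalg.lstsq(design, coeffs, rcond=None)

# Predict the full solution at an unseen initial condition.
theta_new = 1.3
pred = mean + np.vander([theta_new], 6) @ beta @ basis
truth = np.sin(np.pi * grid * theta_new)
err = np.max(np.abs(pred[0] - truth))
```

The surrogate evaluates in microseconds regardless of how expensive the original solver is, which is what makes near-real-time forecasting feasible.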
Machine learning and statistical models in chemistry offer the promise of aiding material identification and discovery from experimental measurements, but developing models that are appropriate for chemistry data (including molecular structures and various experimental signatures) can be challenging. In this work, we investigate machine learning approaches that link molecular structures to properties with the ultimate goal of predicting molecular structure from experimental measurements (e.g., nuclear magnetic resonance spectra). We demonstrate the ability of convolutional autoencoder neural networks to represent spectral data and present results on the utility of molecular structure embeddings for downstream tasks.
Keywords
machine learning
chemistry
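A convolutional autoencoder requires a deep-learning framework, but the encode/decode idea for spectra can be illustrated with its simplest linear analogue, PCA, on synthetic peak spectra. The data generator and the 16-dimensional embedding below are hypothetical stand-ins, not the authors' architecture.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic "spectra": sums of Gaussian peaks on a 256-point axis,
# standing in for experimental signatures such as NMR spectra.
axis = np.linspace(0, 10, 256)
def spectrum(centers, widths, heights):
    return sum(h * np.exp(-(axis - c) ** 2 / (2 * w ** 2))
               for c, w, h in zip(centers, widths, heights))

X = np.array([spectrum(rng.uniform(1, 9, 3), rng.uniform(0.1, 0.5, 3),
                       rng.uniform(0.5, 2.0, 3)) for _ in range(200)])

# Linear autoencoder: encoder and decoder are the top-k principal axes.
mu = X.mean(0)
U, s, Vt = np.linalg.svd(X - mu, full_matrices=False)
k = 16
encode = lambda x: (x - mu) @ Vt[:k].T   # 256-dim spectrum -> 16-dim code
decode = lambda z: z @ Vt[:k] + mu       # 16-dim code -> spectrum

Z = encode(X)                            # embeddings for downstream tasks
recon = decode(Z)
rel_err = np.linalg.norm(recon - X) / np.linalg.norm(X)
```

A convolutional network replaces the linear maps with learned nonlinear ones, but the workflow is the same: the low-dimensional codes `Z` are what feed downstream structure-prediction tasks.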
Lightning can cause profound damage to property and life: strikes to the power grid can cause severe damage, and dry lightning can easily start fires. Real-time forecasting of lightning gives emergency services a tool to mitigate these risks. Using data from the Worldwide Lightning Location Network (WWLLN), we construct two models to predict future lightning: a generalized linear model (GLM) and a convolutional neural network paired with long short-term memory (CNN-LSTM). We focus on providing forecasts that rely on no external real-time information except the presence of lightning in the recent past. Therefore, both models leverage spatial and temporal correlations as their primary source of predictive power, rather than traditional spatio-temporal covariates. This ensures a robust model that can produce forecasts quickly even if other data streams, such as cloud data, are not readily available. These models provide a global heatmap of forecasted probabilities in addition to forecasted intensities of lightning. Model performance is evaluated against a rolling held-out test set to understand how performance varies across seasons.
Keywords
lightning
convolutional neural network
generalized linear model
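The GLM half of the comparison can be sketched as an autologistic model: the probability of lightning in a cell at the next time step is regressed on recent occurrence in that cell and its neighborhood, with no other covariates. The simulated grid and the fitted coefficients below are hypothetical; neither the WWLLN data nor the CNN-LSTM is reproduced here.

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulated history: lightning occurrence on a coarse grid over time,
# with persistence (strikes tend to recur where they occurred recently).
T, H, W = 400, 8, 8
state = rng.random((H, W)) < 0.1
frames = []
for _ in range(T):
    p = 0.05 + 0.6 * state          # persistence-driven probability
    state = rng.random((H, W)) < p
    frames.append(state)
frames = np.array(frames, dtype=float)

# GLM features: lagged occurrence in the same cell and its 4-neighborhood,
# i.e. only "the presence of lightning in the recent past".
prev = frames[:-1]
nbr = np.zeros_like(prev)
for dy, dx in [(-1, 0), (1, 0), (0, -1), (0, 1)]:
    nbr += np.roll(np.roll(prev, dy, axis=1), dx, axis=2)
X = np.stack([np.ones_like(prev), prev, nbr / 4], axis=-1).reshape(-1, 3)
y = frames[1:].reshape(-1)

# Fit the logistic GLM by plain gradient ascent (no external deps).
beta = np.zeros(3)
for _ in range(2000):
    p = 1 / (1 + np.exp(-X @ beta))
    beta += 1.0 * X.T @ (y - p) / len(y)

forecast = 1 / (1 + np.exp(-X @ beta))   # per-cell probability heatmap
```

Reshaping `forecast` back to `(T-1, H, W)` gives the probability heatmap; an intensity forecast would swap the Bernoulli likelihood for a count model such as Poisson.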
The United States Air Force utilizes the methodologies in MIL-HDBK-1823A as a standard to enable Probability of Detection (POD) studies for Nondestructive Evaluation, as a means to aid the evaluation of airframe integrity. The primary metric of interest in these studies is the upper bound of the 95% confidence interval associated with the 0.90 POD, known as a90|95: the largest flaw size detectable at this probability with this confidence. Therefore, to estimate a90|95, data are collected across a wide range of flaw sizes in order to build a POD curve with enough fidelity to estimate this bound. Sample size thus becomes an important consideration in these studies due to the associated cost and required resources. This work examines current and extended contemporary methods for POD estimation with respect to sample size requirements. Specifically, we present and compare formulas for POD estimation utilizing one methodology outlined in MIL-HDBK-1823A, 1) a vs a-hat, in addition to 2) methods for dependent data. We conclude by comparing sample size requirements for these methods across a range of data characteristics inspired by typical POD studies.
Keywords
Probability of Detection
Nondestructive Evaluation
flaw size
a-hat vs a
hit/miss
sample size
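A minimal sketch of the a-hat vs a formulation: regress log(a_hat) on log(a), convert the fit into a normal-CDF POD curve, and bound a90 from above. The simulated study parameters below are hypothetical, and the bootstrap bound is only an illustrative stand-in; MIL-HDBK-1823A specifies Wald-type confidence bounds rather than a bootstrap.

```python
import numpy as np

rng = np.random.default_rng(4)

# Simulated "a-hat vs a" study: measured response grows with true flaw
# size a, with lognormal scatter (all parameters hypothetical).
a = np.exp(rng.uniform(np.log(0.5), np.log(5.0), size=60))
log_ahat = 0.2 + 1.0 * np.log(a) + rng.normal(scale=0.3, size=a.size)
thresh = 0.4    # detection threshold on log(a_hat)

def pod_params(la, ly):
    """OLS fit of log(a_hat) on log(a); returns (mu, sigma) of the
    implied POD curve POD(a) = Phi((log a - mu) / sigma)."""
    b1, b0 = np.polyfit(la, ly, 1)
    tau = (ly - (b0 + b1 * la)).std(ddof=2)
    return (thresh - b0) / b1, tau / b1

def a90(mu, sigma):
    # flaw size with POD = 0.90; 1.2816 is the 0.90 normal quantile
    return np.exp(mu + 1.2816 * sigma)

mu, sigma = pod_params(np.log(a), log_ahat)

# Bootstrap upper 95% bound on a90 as a stand-in for a90|95.
boot = []
for _ in range(500):
    idx = rng.integers(0, a.size, a.size)
    boot.append(a90(*pod_params(np.log(a)[idx], log_ahat[idx])))
a90_95 = np.quantile(boot, 0.95)
```

Rerunning this with smaller `size` values shows directly how the gap between the a90 point estimate and its upper bound widens as the sample shrinks, which is the sample-size trade-off the abstract studies.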
Modeling the equation of state (EOS) of chemically dissociating materials at extreme temperature and density conditions is necessary to predict their thermodynamic behavior in simulations and experiments. The task is challenging, however, because the experimental and theoretical data needed to calibrate the parameters of the EOS model, such as the latent molar mass surface, are sparse. In this work, we adopt semi-parametric models for the latent molar mass of the material and its corresponding free energy surface. Our method employs basis representations of the latent surfaces with regularization to address challenges in basis selection and to prevent overfitting. We show with an example involving carbon dioxide that our method improves model fit over simpler representations of the molar mass surface while preserving low computational overhead. This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344. LLNL-ABS-872125
Keywords
Semi-Parametric
Uncertainty Quantification
Inverse Problem
Material Science
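The regularized basis representation can be sketched in one dimension: fit sparse, noisy observations of a latent curve with an overcomplete Gaussian basis and a ridge penalty. The toy tanh "surface", basis width, and penalty value below are hypothetical choices, not the LLNL model.

```python
import numpy as np

rng = np.random.default_rng(5)

# Sparse, noisy calibration data for a latent 1-D curve (a toy
# stand-in for a slice of the molar mass surface).
x_obs = np.sort(rng.uniform(0, 1, 25))
f = lambda x: 1 + 0.5 * np.tanh(8 * (x - 0.5))   # dissociation-like step
y_obs = f(x_obs) + rng.normal(scale=0.02, size=x_obs.size)

# Overcomplete Gaussian basis; the ridge penalty guards against
# overfitting when the basis is richer than the data supports.
centers = np.linspace(0, 1, 30)
def design(x, width=0.12):
    return np.exp(-(x[:, None] - centers[None, :]) ** 2 / (2 * width ** 2))

B = design(x_obs)
lam = 1e-3   # regularization strength (hypothetical)
coef = np.linalg.solve(B.T @ B + lam * np.eye(len(centers)), B.T @ y_obs)

# Evaluate the fitted latent curve on a dense grid.
x_grid = np.linspace(0.05, 0.95, 200)
fit = design(x_grid) @ coef
rmse = np.sqrt(np.mean((fit - f(x_grid)) ** 2))
```

Because the fit is a small linear solve, the computational overhead stays low regardless of how the calibrated surface is later queried, which is the property the abstract emphasizes.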