Monday, Aug 4: 8:30 AM - 10:20 AM
4042
Contributed Papers
Music City Center
Room: CC-104D
Main Sponsor
Section on Statistical Computing
Presentations
Parameter estimation for nonlinear dynamic system models, represented by ordinary differential equations (ODEs) or partial differential equations (PDEs), using noisy and sparse experimental data is a vital task in many fields. We propose a fast and accurate method, physics-informed Gaussian process, for this task. Our method uses a Gaussian process model over system components, explicitly conditioned on the physics information that gradients of the Gaussian process must satisfy the ODE/PDE system. By doing so, we completely bypass the need for numerical integration and achieve substantial savings in computational time. Our method is also suitable for inference with unobserved system components and provides uncertainty quantification. Our method is distinct from existing approaches as we provide a principled statistical construction under a Bayesian framework, which rigorously incorporates the ODE/PDE system through conditioning.
Keywords
Physics-Informed Gaussian Process
Non-linear Differential Equations
Bayesian Inference
First Author
Shihao Yang, Georgia Institute of Technology
Presenting Author
Shihao Yang, Georgia Institute of Technology
In this paper, a multinomial probit model is proposed to examine a categorical response variable, with the main objective being the identification of the influential variables in the model. To this end, a Bayesian selection technique is employed featuring two hierarchical indicators, where the first indicator denotes a variable's relevance to the categorical response, and the subsequent indicator relates to the variable's importance at a specific categorical level, which aids in assessing its impact at that level. The selection process relies on the posterior indicator samples generated through an MCMC algorithm. The efficacy of our Bayesian selection strategy is demonstrated through both simulation and an application to a real-world example.
Keywords
Indicator
Componentwise Gibbs sampler
MCMC algorithm
Median probability criterion
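As an illustration of the median probability criterion named in the keywords, the sketch below selects every variable whose posterior inclusion probability exceeds 0.5. It is a minimal example under stated assumptions, not the authors' implementation: the function name `median_probability_selection` is hypothetical, the indicator draws are simulated rather than produced by a real sampler, and only a single level of indicators is handled, whereas the abstract describes two hierarchical levels.

```python
import numpy as np

def median_probability_selection(indicator_draws, threshold=0.5):
    """Select variables whose posterior inclusion probability exceeds the threshold.

    indicator_draws: array of shape (n_draws, n_variables) holding 0/1 indicator
    samples, e.g. from an MCMC run (assumed here, not taken from the abstract).
    """
    draws = np.asarray(indicator_draws, dtype=float)
    # Posterior inclusion probability = share of draws with the indicator switched on.
    inclusion_prob = draws.mean(axis=0)
    selected = np.flatnonzero(inclusion_prob > threshold)
    return inclusion_prob, selected

# Simulated indicator draws standing in for real MCMC output (illustrative only).
rng = np.random.default_rng(0)
assumed_inclusion = np.array([0.9, 0.1, 0.7, 0.3])
draws = rng.random((2000, 4)) < assumed_inclusion

inclusion_prob, selected = median_probability_selection(draws)
print("Estimated inclusion probabilities:", np.round(inclusion_prob, 2))
print("Selected variable indices:", selected.tolist())
```

The same rule could be applied a second time to level-specific indicators, restricted to the variables that pass the first stage.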
Analyzing the covariances of modern datasets has become increasingly difficult as datasets grow and as covariance estimation techniques often carry a high time complexity. For data which admit a multiway structure, the tensor normal model has become ubiquitous in modeling the covariance due to its ability to model covariances through smaller sequential mode-wise operations. However, the structural assumptions required by the tensor normal necessarily limit the flexibility of the covariance. Such limitations on flexibility have been tested in spatio-temporal contexts and found to be impractical in some cases.
In this work, we consider a Cholesky factor parametrization for the precision matrix of a tensor normal which directly relaxes one of the structural assumptions of the tensor normal distribution's covariance without loss of analytic tractability of the likelihood. We connect this parametrization with the Fréchet mean under the Log-Cholesky Riemannian metric, and then use it to construct a hierarchical empirical Bayes model for relevancy detection in the sum of Kronecker products model.
Keywords
Hamiltonian Monte Carlo
Riemannian Geometry
Separable Covariance
matrix normal distribution
Pitsianis-Van Loan decomposition
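The Fréchet mean under the Log-Cholesky metric mentioned in the abstract has a simple closed form, sketched below. This is an illustration only and assumes the standard construction for that metric: strictly lower-triangular entries of the Cholesky factors are averaged arithmetically, while diagonal entries are averaged on the log scale. The function name `log_cholesky_mean` and the example matrices are hypothetical, and the sketch does not reproduce the authors' tensor normal model.

```python
import numpy as np

def log_cholesky_mean(spd_matrices):
    """Frechet mean of SPD matrices under the Log-Cholesky metric (closed form)."""
    factors = [np.linalg.cholesky(S) for S in spd_matrices]
    # Arithmetic mean of the strictly lower-triangular parts.
    strict_lower = np.mean([np.tril(L, k=-1) for L in factors], axis=0)
    # Geometric mean of the diagonals, computed as an average of logs.
    log_diag = np.mean([np.log(np.diag(L)) for L in factors], axis=0)
    L_mean = strict_lower + np.diag(np.exp(log_diag))
    # Map the averaged Cholesky factor back to an SPD matrix.
    return L_mean @ L_mean.T

# Two small SPD matrices chosen purely for illustration.
A = np.array([[2.0, 0.5], [0.5, 1.0]])
B = np.array([[1.0, -0.3], [-0.3, 3.0]])
M = log_cholesky_mean([A, B])
print(M)
```

Because the average is taken on Cholesky factors with a positive diagonal, the result is always symmetric positive definite, which is one reason this parametrization is convenient for precision matrices.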
Identifying Granger causality in high-dimensional time series is crucial for understanding their complex dependence structures and improving forecasting accuracy, particularly in fields such as finance and neuroscience. In this work, we propose a novel deep state-space model in which state transitions are jointly modeled using a deep neural network, while the measurement equation remains linear to facilitate downstream analysis. To efficiently handle long-term high-dimensional time series, we develop a scalable Bayesian deep particle filtering algorithm that tracks latent states and uncovers the temporal dependencies between time series. We establish the convergence properties of the proposed algorithm, ensuring its theoretical soundness. Our method offers a principled approach to discovering causal relationships in challenging high-dimensional time series applications. We demonstrate its effectiveness through both simulated data and real-world applications, including the one-minute log returns of Nasdaq stocks.
Keywords
High-dimensional time series
Granger causality
Nonlinear state space models
Deep particle filtering
Bayesian deep neural networks
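To make the state-space setup in the abstract concrete, here is a minimal bootstrap particle filter for a model with a nonlinear state transition and a linear measurement equation. It is a generic textbook-style sketch, not the authors' Bayesian deep particle filtering algorithm: the scalar state, the `tanh` transition standing in for a neural network, the noise levels, and all function names are assumptions made for illustration.

```python
import numpy as np

def bootstrap_particle_filter(y, transition, h, q_std, r_std, n_particles=500, seed=0):
    """Filtered state means for x_t = transition(x_{t-1}) + noise, y_t = h * x_t + noise."""
    rng = np.random.default_rng(seed)
    particles = rng.normal(0.0, 1.0, size=n_particles)
    means = np.empty(len(y))
    for t, y_t in enumerate(y):
        # Propagate particles through the (nonlinear) state transition.
        particles = transition(particles) + rng.normal(0.0, q_std, size=n_particles)
        # Weight by the Gaussian likelihood of the linear measurement equation.
        log_w = -0.5 * ((y_t - h * particles) / r_std) ** 2
        w = np.exp(log_w - log_w.max())
        w /= w.sum()
        means[t] = np.sum(w * particles)
        # Multinomial resampling to avoid weight degeneracy.
        particles = particles[rng.choice(n_particles, size=n_particles, p=w)]
    return means

def transition(x):
    # Hypothetical stand-in for a learned neural-network transition.
    return 0.9 * np.tanh(x)

# Simulate a short series from the assumed model.
rng = np.random.default_rng(1)
T, h, q_std, r_std = 100, 2.0, 0.5, 0.3
x = np.empty(T)
y = np.empty(T)
state = 0.0
for t in range(T):
    state = transition(state) + rng.normal(0.0, q_std)
    x[t] = state
    y[t] = h * state + rng.normal(0.0, r_std)

filtered = bootstrap_particle_filter(y, transition, h, q_std, r_std)
print("First five filtered means:", np.round(filtered[:5], 3))
```

Keeping the measurement equation linear, as the abstract describes, means the particle weights only require a Gaussian density evaluation, which keeps the weighting step cheap.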
Advanced statistical modelling techniques which utilise satellite-based human settlement data within a robust geostatistical modelling framework have been developed to fill small area population data gaps and support development and humanitarian programmes across many countries of the world. However, the detection of human settlement by remote sensing satellites can be affected by environmental and topographical factors such as canopy, snow or cloud cover, topographical variations in mountainous landscapes, and similarities between buildings and surrounding landscapes. Here, using a Bayesian hierarchical joint modelling approach, we extend existing geospatial estimation methods by simultaneously modelling human settlement detection probability and population density, to account for false detection rates in satellite observations within a coherent bottom-up population modelling strategy. Our methodology was validated using a simulation study, which showed a reduction in relative bias of between 21% and 49%, and it achieved a 28% reduction in relative bias when applied to produce gridded population estimates (at 100m-by-100m resolution) for the Democratic Republic of Congo.
Keywords
Bayesian Geospatial Joint Modelling
Small area population estimates
Remote sensing
False positive rates
Geostatistics
INLA-SPDE
Hierarchical models
Knowledge distillation is a powerful method for model compression, enabling the efficient deployment of complex deep learning models (teachers), including large language models. However, its statistical mechanisms remain unclear, and uncertainty evaluation is often overlooked, especially in real-world scenarios requiring diverse teacher expertise. To address these challenges, we introduce Multi-Teacher Bayesian Knowledge Distillation (MT-BKD), where a distilled student model learns from multiple teachers within the Bayesian framework. Our approach leverages Bayesian inference to capture inherent uncertainty in the distillation process. We introduce a teacher-informed prior, integrating external knowledge from teacher models and training data, offering better generalization, robustness, and scalability. Additionally, an entropy-based weighting mechanism adjusts each teacher's influence, allowing the student to combine multiple sources of expertise effectively. MT-BKD enhances interpretability, improves predictive accuracy, and provides uncertainty quantification. Our experiments show improved performance and robust uncertainty quantification, highlighting the strengths of MT-BKD.
Keywords
Uncertainty Quantification
Large Language Models
Bayesian Priors
Image Classification
Protein Subcellular Prediction
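The abstract mentions an entropy-based weighting mechanism for combining teachers but does not give its formula. The sketch below shows one plausible form, assumed purely for illustration: each teacher is weighted in proportion to the exponential of its negative predictive entropy, so more confident teachers contribute more. The function names and the example probabilities are hypothetical and should not be read as the MT-BKD definition.

```python
import numpy as np

def entropy_weights(teacher_probs, eps=1e-12):
    """Weights proportional to exp(-entropy) of each teacher's predictive distribution.

    teacher_probs: array of shape (n_teachers, n_classes), rows summing to 1.
    The exp(-entropy) form is an assumption made for this sketch.
    """
    p = np.asarray(teacher_probs, dtype=float)
    entropy = -np.sum(p * np.log(p + eps), axis=1)
    scores = np.exp(-entropy)
    return scores / scores.sum()

def combine_teachers(teacher_probs):
    """Entropy-weighted mixture of teacher predictions for a single input."""
    p = np.asarray(teacher_probs, dtype=float)
    return entropy_weights(p) @ p

# One confident teacher and one nearly uniform teacher (illustrative values).
teacher_probs = np.array([
    [0.90, 0.05, 0.05],
    [0.34, 0.33, 0.33],
])
weights = entropy_weights(teacher_probs)
soft_target = combine_teachers(teacher_probs)
print("Teacher weights:", np.round(weights, 3))
print("Combined soft target:", np.round(soft_target, 3))
```

In a distillation setting, a combined distribution like `soft_target` would typically serve as the soft label the student is trained to match.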