Tuesday, Aug 6: 10:30 AM - 12:20 PM
6029
Contributed Posters
Oregon Convention Center
Room: CC-Hall CD
Main Sponsor
Section on Bayesian Statistical Science
Presentations
Consider a simple illustration: a fish species living in shallow coastal waters. An impermeable barrier for this species would be a set of islands, since there is no scenario in which the fish cross them. There may also be sand patches whose water coverage varies with the tide. These sand patches cannot be treated as permanently impermeable barriers, because fish will still be present there, only less often than in the surrounding non-barrier area. This is a rather common setup, yet no existing models cover this case. We propose a Transparent Barrier model that can handle such complex barrier scenarios. Moreover, it relies on a Matérn field, making it as efficient as the classic stationary models in spatial statistics. The Transparent Barrier model is based on interpreting the Matérn correlation as a collection of paths through a Simultaneous Autoregressive (SAR) model, manipulating local dependencies to cut off paths that cross physical barriers, and formulating the result as a stochastic partial differential equation (SPDE) for well-behaved discretization. We then include a transparency parameter to explicitly represent barriers with different levels of permeability.
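As a hedged illustration of this construction (the scaling constant and parameterization below follow the barrier-SPDE literature in general, not necessarily this paper), the Matérn field u(s) can be written as the solution of an SPDE with a spatially varying range, where the range inside each barrier is shrunk by a barrier-specific factor playing the role of the transparency parameter:
\[
u(s) \;-\; \nabla\cdot\!\Big(\tfrac{r(s)^2}{8}\,\nabla u(s)\Big) \;=\; c\, r(s)\,\sigma_u\,\mathcal{W}(s),
\qquad
r(s)=\begin{cases} r, & s \text{ in open water},\\ \phi_b\, r, & s \text{ in barrier } b,\ 0<\phi_b\le 1,\end{cases}
\]
where \mathcal{W}(s) is Gaussian white noise and c is a normalizing constant fixing the marginal variance; \phi_b \to 0 gives a nearly impermeable barrier, while \phi_b = 1 recovers the stationary Matérn model.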
Keywords
Spatial distribution model
Non-stationary Gaussian random field
Barrier model
Coastline and island problem
Stochastic Partial Differential Equations (SPDE)
INLA
Abstracts
Complex surveys have garnered substantial significance across diverse domains, spanning the social sciences, public health, and market research. Their pivotal role lies in furnishing representative estimates while adeptly addressing the intricacies of survey design effects. When faced with complexities arising from the unknown effects of various covariates, parametric approaches may prove insufficient to handle the nuances associated with survey design impacts. Additionally, the Gaussian error assumption is inappropriate in many applications where the response distribution is heavy-tailed or skewed. This paper introduces the Bayesian Additive Regression Trees (BART) framework, a potent and adaptable approach tailored for analysing intricate survey data, specifically with subject weights. We propose an extension of BART that models heavy-tailed and skewed error distributions while accounting for subject weights. Its ability to account for survey design features, handle non-linearity, and provide uncertainty estimates makes it a valuable tool for researchers and practitioners working with complex survey data.
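A minimal sketch of how such a model could be written (the weighting scheme and error family here are illustrative assumptions, not necessarily the authors' construction): with f a sum-of-trees function under the usual BART prior,
\[
y_i = f(x_i) + \varepsilon_i, \qquad f(x) = \sum_{j=1}^{m} g(x;\, T_j, M_j), \qquad \varepsilon_i \sim F_\theta \ \text{(heavy-tailed or skewed)},
\]
and the subject weights w_i could enter through a weighted pseudo-likelihood \prod_i p(y_i \mid f(x_i), \theta)^{w_i}, so that subjects with larger design weights contribute more to the posterior.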
Keywords
Bayesian nonparametrics
Bayesian additive regression trees
Complex survey
Abstracts
The log-Gaussian Cox process (LGCP) is arguably the most widely used model-based strategy for analyzing spatial point pattern (SPP) data. In practice, we usually have several models of increasing complexity that we need to criticize, whose assumptions we need to assess, and which we need to validate. This work attempts to provide a practical solution, under a Bayesian framework, to some of these problems using cross-validation (CV). The challenge is that, contrary to the traditional CV approach based on the expected log pointwise predictive density, in SPP analysis there is no concept of a data point to be removed, which requires a group-wise or region-wise definition of the log predictive density. For this purpose, we propose a natural extension of the expected log predictive density, better suited to the LGCP, which could be termed the expected log region-wise (or group-wise) predictive density. We also provide a very accurate, fast, and deterministic approximation obtained from a single run of the model, which we validate against Monte Carlo samples. We expect to make the solution available in the R-INLA software.
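To fix ideas, a hedged sketch of what such a criterion could look like (the exact definition used in this work may differ): writing the LGCP intensity as \Lambda(s) = \exp\{\eta(s)\} with \eta a Gaussian random field, and partitioning the observation window into regions A_1, \dots, A_K,
\[
\mathrm{ELRPD} \;\approx\; \sum_{k=1}^{K} \log p\big( Y \cap A_k \,\big|\, Y \setminus A_k \big)
\;=\; \sum_{k=1}^{K} \log \int p\big( Y \cap A_k \,\big|\, \Lambda \big)\, p\big( \Lambda \,\big|\, Y \setminus A_k \big)\, d\Lambda,
\]
so each region of the point pattern is predicted from a posterior fitted without it, replacing the leave-one-point-out idea of ordinary CV.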
Keywords
Log Gaussian Cox process
cross validation
INLA
model selection
Abstracts
In observational studies, no unmeasured confounding (ignorability of the treatment assignment) is typically assumed in order to identify the causal effect. However, this assumption is untestable and often fails to hold in practice. Recent work has shown that when a resistant population is available, the conditional average treatment effect on the treated can still be identified without assuming ignorability of the treatment assignment. This estimand, Resistant Population Calibration Of Variance (RPCOVA), however, requires estimation of the conditional variance function, unlike estimands based on inverse probability weighting, differences in conditional expectations, or doubly robust constructions. We propose a nonparametric Bayesian approach for inference on this estimand, using a dependent Dirichlet process to model the response. We establish weak consistency of the estimator and explore its finite sample performance in simulations.
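For reference, the estimand of interest is the conditional average treatment effect on the treated, usually written as
\[
\tau(x) \;=\; \mathbb{E}\big[\, Y(1) - Y(0) \;\big|\; X = x,\; T = 1 \,\big],
\]
which is conventionally identified by assuming ignorability, (Y(0), Y(1)) \perp T \mid X; the appeal of the resistant-population approach is that it avoids this untestable assumption (the RPCOVA identification formula itself involves the conditional variance function and is not reproduced here).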
Keywords
Causal Inference
Unmeasured Confounders
Gibbs Sampler
Nonparametric Bayesian
Conditional Average Treatment Effect on the Treated
Abstracts
In this poster presentation, we apply Hamiltonian Monte Carlo (HMC) to estimate the three parameters of the Burr III distribution and compare the HMC estimates with results from the Metropolis-Hastings algorithm. We also use HMC to analyze the Arthritis Relief Times data.
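Under one common three-parameter parameterization (the parameterization used in the poster may differ), the Burr III distribution with shape parameters c, k > 0 and scale s > 0 has
\[
F(x \mid c, k, s) = \Big[ 1 + (x/s)^{-c} \Big]^{-k}, \qquad
f(x \mid c, k, s) = \frac{c\,k}{s}\,(x/s)^{-c-1}\Big[ 1 + (x/s)^{-c} \Big]^{-(k+1)}, \qquad x > 0,
\]
and HMC explores the posterior of (c, k, s) using gradients of the log posterior, which typically mixes better than random-walk Metropolis-Hastings when the parameters are strongly correlated.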
Keywords
Hamiltonian Monte Carlo
Burr III Distribution
Abstracts
One of the bottlenecks to building semiconductor chips is the increasing cost required to develop chemical plasma processes that form the transistors and memory storage cells. These processes are still developed manually using highly trained engineers searching for a combination of tool parameters that produces an acceptable result on the silicon wafer. Here we study Bayesian optimization algorithms to investigate how artificial intelligence might decrease the cost of developing complex semiconductor chip processes. In particular, we create a controlled virtual process game to systematically benchmark the performance of humans and computers for the design of a semiconductor fabrication process. We find that human engineers excel in the early stages of development, whereas the algorithms are far more cost-efficient near the tight tolerances of the target. Furthermore, we show that a strategy using both human designers with high expertise and algorithms in a human first–computer last strategy can reduce the cost-to-target by half compared with only human designers.
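As a hedged illustration of the algorithmic side only, the sketch below is a generic Gaussian-process surrogate with expected-improvement acquisition on a made-up objective called virtual_process; it is not the authors' benchmark, virtual process game, or human-computer strategy.

    # Hedged sketch: generic Bayesian optimization with a GP surrogate and
    # expected improvement. `virtual_process` is a hypothetical stand-in for a
    # wafer-metric-vs-tool-parameter response surface.
    import numpy as np
    from scipy.stats import norm
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import Matern

    rng = np.random.default_rng(0)

    def virtual_process(x):
        """Hypothetical cost: distance of the wafer result from the target spec."""
        return np.sum((x - 0.7) ** 2, axis=-1) + 0.01 * rng.normal(size=x.shape[0])

    def expected_improvement(gp, X_cand, y_best, xi=0.01):
        """EI for minimization: E[max(y_best - f(x) - xi, 0)] under the GP posterior."""
        mu, sigma = gp.predict(X_cand, return_std=True)
        imp = y_best - mu - xi
        z = np.divide(imp, sigma, out=np.zeros_like(imp), where=sigma > 0)
        return imp * norm.cdf(z) + sigma * norm.pdf(z)

    dim, n_init, n_iter = 3, 5, 20          # tool parameters scaled to [0, 1]^dim
    X = rng.uniform(size=(n_init, dim))     # initial "recipes"
    y = virtual_process(X)

    for _ in range(n_iter):
        gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
        gp.fit(X, y)
        X_cand = rng.uniform(size=(2000, dim))   # cheap random candidate pool
        x_next = X_cand[np.argmax(expected_improvement(gp, X_cand, y.min()))]
        y_next = virtual_process(x_next[None, :])  # one more costly "wafer run"
        X, y = np.vstack([X, x_next]), np.append(y, y_next)

    print("best recipe:", X[np.argmin(y)], "cost:", y.min())

In a human first-computer last strategy of the kind described in the abstract, the initial design points would come from the engineer's runs rather than from random sampling.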
Keywords
semiconductor fabrication process
recipe optimization
Bayesian optimization
virtual process
Abstracts
With the substantial increase in the availability of geostatistical data, statisticians are now equipped to make inference on spatial covariance from large datasets, which is critical for understanding spatial dependence. Traditional methods, such as Markov chain Monte Carlo (MCMC) sampling within a Bayesian framework, can become computationally expensive as the number of spatial locations increases. As an important alternative to MCMC, variational inference approximates the posterior distribution through optimization. In this paper, we propose a nearest neighbor Gaussian process variational inference (NNGPVI) method to approximate the posterior. This method introduces nearest-neighbor-based sparsity in both the prior and the approximated posterior distribution. Doubly stochastic gradient methods are developed to implement the optimization. Our simulation studies demonstrate that NNGPVI achieves accuracy comparable to MCMC methods but with reduced computational cost. An analysis of satellite temperature data illustrates the practical implementation of NNGPVI and shows that its inference results match those obtained from the MCMC approach.
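A brief sketch of the sparsity being exploited (this is the standard NNGP construction; how it is mirrored in the variational family is only summarized loosely here): for spatial effects w = (w_1, \dots, w_n) at ordered locations, the NNGP prior replaces the full Gaussian process density by
\[
p(w) \;=\; \prod_{i=1}^{n} p\big( w_i \,\big|\, w_{N(i)} \big),
\]
where N(i) contains at most m nearest neighbors among the previously ordered locations and each conditional is the Gaussian implied by the original covariance function. NNGPVI would impose an analogous nearest-neighbor factorization on the approximating posterior and optimize its parameters with doubly stochastic gradients.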
Keywords
Bayesian Modeling
Spatial Statistics
Variational Inference
Gaussian Process
Nearest Neighbor
Abstracts
Gaussian graphical models (GGMs) encode the conditional independence structure among multivariate normal random variables as zero entries in the precision matrix. They are powerful tools with diverse applications in genetics, portfolio optimization, and computational neuroscience. Bayesian approaches have advantages over frequentist methods because they encourage sparsity in the graph, incorporate prior information, and account for uncertainty in the graph structure. However, due to the computational burden of MCMC, scalable Bayesian estimation of GGMs remains an open problem. We propose a novel approach that uses empirical Bayes nodewise regression, allowing efficient estimation of the precision matrix and flexibility in incorporating prior information in large-dimensional settings. The empirical Bayes variable selection methods considered in our study include SEMMS, Zellner's g-prior, and nonlocal priors. If necessary, a post-filling model selection step is used to discover the underlying graph. Simulation results show that our Bayesian method compares favorably with competing methods in terms of accuracy metrics and excels in computational speed.
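For context, the identity that nodewise regression exploits in Gaussian graphical models: regressing each variable on all the others,
\[
X_j = \sum_{k \neq j} \beta_{jk} X_k + \varepsilon_j, \qquad
\beta_{jk} = -\,\frac{\omega_{jk}}{\omega_{jj}}, \qquad
\operatorname{Var}(\varepsilon_j) = \frac{1}{\omega_{jj}},
\]
where \Omega = (\omega_{jk}) is the precision matrix, so a zero regression coefficient corresponds to a missing edge. Empirical Bayes variable selection applied to each nodewise regression therefore yields both the sparsity pattern and the entries of \Omega.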
Keywords
Gaussian graphical model
high-dimensional statistics
network analysis
empirical Bayes
nodewise regression
sparsity
Abstracts
We consider the variable selection problem for linear models in the M-open setting, where the data generating process is outside the model space. We focus on the novel problem of Model Superinduction, which refers to the tendency of model selection procedures to exponentially favor larger models as the sample size grows, resulting in overparametrized models which induce severe computational difficulties. We prove the existence of this phenomenon for popular classes of model selection priors, such as mixtures of g-priors and the family of spike and slab priors. We further show this behavior is inescapable for any KL-divergence minimizing model selection procedure, so we seek to minimize its effects for large n, while preserving posterior consistency. We propose variants of the aforementioned priors that result in a slowly diminishing rate of prior influence on the posterior, which favors simpler models while preserving consistency. We further propose a model space prior which induces stronger model complexity penalization for large sample sizes. We demonstrate the efficacy of our proposed solutions via synthetic data examples and a case study using albedo data from GOES satellites.
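For concreteness, the standard forms of the two prior families studied, written for a model \gamma with design matrix X_\gamma (the exact hyperparameter choices and proposed variants in the paper may differ):
\[
\text{g-prior: } \beta_\gamma \mid g, \sigma^2, \gamma \sim \mathcal{N}\!\big( 0,\; g\, \sigma^2 (X_\gamma^\top X_\gamma)^{-1} \big),
\qquad
\text{spike and slab: } \beta_j \sim (1 - \pi)\, \delta_0 + \pi\, \mathcal{N}(0, \tau^2).
\]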
Keywords
Model selection
Bayesian decision theory
M-open model comparison
Linear Models
Spike and Slab prior
g-prior
Abstracts
The amount of available covariates in medical data is expanding with each passing year, making identification of the most influential factors pivotal in survival regression modeling. Bayesian analyses focusing on variable selection are a common approach to this problem; however, most use approximations of the posterior to perform this task. In this paper, we propose placing a beta prior directly on the model coefficient of determination (Bayesian R2), which acts as a shrinkage prior on the global variance of the predictors. Through reparameterization using an auxiliary variable, we are able to update a majority of the parameters with sequential Gibbs sampling, thus reducing reliance on approximate posterior inference and simplifying computation. Performance over competing variable selection priors is then showcased through an extensive simulation study in both censored and non-censored settings. Finally, the method is applied to identifying influential built-environment risk factors impacting the survival time of Medicare-eligible patients in California with cardiovascular ailments.
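A hedged sketch of the central idea in the notation of R2-type shrinkage priors (how this is adapted to the censored/AFT setting is not reproduced here): with linear predictor x^\top\beta and error variance \sigma^2,
\[
R^2 \;=\; \frac{\operatorname{Var}(x^\top \beta)}{\operatorname{Var}(x^\top \beta) + \sigma^2}, \qquad R^2 \sim \mathrm{Beta}(a, b),
\]
so the beta prior on R^2 induces a prior on the global variance of the coefficients, shrinking toward simpler models when a and b favor small R^2.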
Keywords
Survival
AFT
Global-Local
Bayesian
Abstracts
While Network Meta-Analysis (NMA) facilitates simultaneous assessment of multiple treatments, challenges such as sparse direct comparisons among treatments persist, making accurate estimation of the correlation between multiple treatments in arm-based NMA (AB-NMA) challenging. To address these challenges and complement the analysis, we develop a novel sensitivity analysis tool tailored for AB-NMA: a tipping point analysis within the Bayesian framework, specifically targeting correlation parameters, to assess their influence on the robustness of conclusions about relative treatment effects, including changes in statistical significance and the magnitude of point estimates. Applying the analysis to multiple NMA datasets with 112 treatment pairs, we identified tipping points in 13 pairs (11.6%) for significance change and in 29 pairs (25.9%) for magnitude change with a 15% threshold. Our results underscore potential commonality in tipping points, emphasizing the necessity of our proposed analysis, especially in networks with sparse direct comparisons or wide credible intervals of estimated correlation.
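Schematically (a hedged reading of the procedure, not a verbatim description): the correlation parameter \rho between treatment-specific random effects is fixed on a grid, the AB-NMA is refit at each value, and a tipping point is recorded when a conclusion about the relative effect d_{AB} of treatment A versus B changes, for example
\[
\rho^{*}_{\text{sig}} = \min\{\rho : 0 \in \mathrm{CrI}_{95\%}(d_{AB}; \rho)\},
\qquad
\rho^{*}_{\text{mag}} = \min\Big\{\rho : \big|\hat d_{AB}(\rho) - \hat d_{AB}(\rho_0)\big| \big/ \big|\hat d_{AB}(\rho_0)\big| > 0.15\Big\},
\]
with 15% being the magnitude threshold used in the application.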
Keywords
network meta-analysis
correlation between multiple treatments
tipping point analysis
sensitivity analysis
robustness of research conclusion
statistical significance
Abstracts
In the identification of source problems within forensic science, the forensic examiner is tasked with providing a summary of evidence to allow a decision maker to evaluate the source of some evidence. The type of data encountered in the forensic identification of source problems often has a hierarchical structure, where there is a within and between source distribution for each object in a sample. One method of providing this summary of evidence is through a likelihood ratio (LR) or a Bayes factor (BF). With these methods, it is often the case that the two densities are estimated separately and then the ratio is reported, which can lead to instances where the resulting LR is large due to a small density in the denominator. In this work, we explore the use of the truncated normal distribution for use in LRs and BFs to attempt to alleviate this phenomenon. We also begin to characterize the robustness of these truncated normal LR methods.
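For reference, the value-of-evidence ratio and the truncated normal density used to build it (the truncation limits a and b would be an analyst's choice):
\[
\mathrm{LR} = \frac{p(e \mid H_p)}{p(e \mid H_d)}, \qquad
f(x \mid \mu, \sigma, a, b) = \frac{\phi\!\big( (x - \mu)/\sigma \big)}{\sigma \big[ \Phi\!\big( (b - \mu)/\sigma \big) - \Phi\!\big( (a - \mu)/\sigma \big) \big]}, \quad a \le x \le b,
\]
with \phi and \Phi the standard normal pdf and cdf; restricting the support in this way is one route to preventing vanishingly small denominator densities from inflating the LR.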
Keywords
forensic source identification
value of evidence
likelihood ratio
truncated normal distribution
Abstracts
The recently developed semi-parametric generalized linear model (SPGLM) offers more flexibility as compared to the classical GLM by including the baseline or reference distribution of the response as an additional parameter in the model. However, some inference summaries are not easily generated under existing maximum-likelihood based inference (ML-SPGLM). This includes uncertainty in estimation for model-derived functionals such as exceedance probabilities. The latter are critical in a clinical diagnostic or decision-making setting. In this article, by placing a Dirichlet prior on the baseline distribution, we propose a Bayesian model-based approach for inference to address these important gaps. We establish consistency and asymptotic normality results for the implied canonical parameter. Simulation studies and an illustration with data from an aging research study confirm that the proposed method performs comparably or better in comparison with ML-SPGLM. The proposed Bayesian framework is most attractive for inference with small sample training data or in sparse-data scenarios.
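A hedged sketch of the SPGLM structure as it is usually written (the authors' notation may differ): the response density is an exponential tilt of an unknown baseline distribution f_0,
\[
p(y \mid x) \;=\; f_0(y)\, \exp\big\{ \theta(x)\, y - b\big(\theta(x); f_0\big) \big\}, \qquad
g\big( \mathbb{E}[Y \mid x] \big) = x^\top \beta,
\]
where b(\cdot) normalizes the density and \theta(x) is determined by the mean model. The Bayesian version places a Dirichlet prior on f_0 over the observed support, so model-derived functionals such as exceedance probabilities P(Y > c \mid x) come with full posterior uncertainty.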
Keywords
Ordinal regression
Nonparametric Bayes
Exceedance probabilities
Skewed Dirichlet
Dependent Dirichlet process
Abstracts
Bayesian adaptive designs with response adaptive randomization (RAR) have the potential to benefit more participants in a clinical trial. While there are many papers that describe RAR designs and results, few report the statistical details of implementing RAR in practice. In this paper, we introduce the statistical methodology and implementation of the trial Changing the Default (CTD). CTD is a single-center prospective RAR comparative effectiveness trial comparing opt-in and opt-out tobacco treatment approaches for hospitalized patients. The design assumed an uninformative prior, a conservative initial allocation ratio, and a higher threshold for stopping for success to protect results from statistical bias. A particular emerging concern with RAR designs is the possibility that time trends will occur during the implementation of a trial. If there is a time trend and the analytic plan does not prespecify an appropriate model, this could lead to a biased trial. Adjustment for time trend was not prespecified in CTD, but a post hoc time-adjusted analysis showed no evidence of influential drift.
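For orientation, one widely used RAR allocation rule of the Thall-Wathen type is sketched below (CTD's actual allocation rule, priors, and stopping thresholds are specified in the trial protocol and may differ):
\[
\pi_A(t) \;=\; \frac{\Pr(p_A > p_B \mid \text{data}_t)^{c}}{\Pr(p_A > p_B \mid \text{data}_t)^{c} + \Pr(p_B > p_A \mid \text{data}_t)^{c}},
\]
where p_A and p_B are the success probabilities of the two arms and the exponent c (typically between 0 and 1) tempers the adaptation; smaller c keeps the allocation closer to 1:1 and is one way to implement a conservative initial allocation.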
Keywords
drift analysis
comparative effectiveness trial
Bayesian adaptive designs
Abstracts