SPEED 3: Bayesian Methods & Applications, Part 1

Apurva Bhingare, Chair
Bristol Myers Squibb
 
Monday, Aug 5: 8:30 AM - 10:20 AM
5030 
Contributed Speed 
Oregon Convention Center 
Room: CC-D135 

Presentations

A Bayesian Non-parametric Framework for Community Detection in Multi-way Interaction Network

Community detection is a fundamental task in network analysis, and learning underlying network structures has brought deep insights into complex systems. Real network data often arise from a series of interactions, each of which may involve more than two nodes, i.e., multi-way interactions, and the block structure may differ across interactions. While many methods have focused on clustering nodes into blocks, few account for the fact that the interactions themselves may exhibit clustering as well. In this project, we introduce a Bayesian non-parametric framework for multi-way interaction networks that jointly models latent node-level block labels and latent interaction-level labels. We discuss challenges regarding the identifiability of the latent labels in this framework and illustrate these issues with simulated data. A Gibbs sampling-based algorithm is derived. We conclude with an application of the proposed method to Medicare claims data across years and a discussion of potential medical implications. 

Keywords

Community Detection in Network data

Bayesian non-parametric framework

Latent Class Model 

View Abstract 2576

Co-Author(s)

Jukka-Pekka Onnela
Max Wang, Harvard University

First Author

Yuhua Zhang, Harvard University

Presenting Author

Yuhua Zhang, Harvard University

A Random Effects Hierarchical Model with Bayes Prior

Longitudinal studies and repeated measures are central to the analysis of correlated data. Irimata and Wilson (2017) presented a measure of these correlations when assessing the strength of association between an outcome of interest and multiple binary outcomes, as well as the clustering present due to correlation. They addressed this set of correlations in a hierarchical model with random effects. Estimation of parameters in such models is hampered by the association between time-dependent binary variables and the outcome of interest. Wilson, Vazquez, and Chen (2020) described marginal models for the analysis of correlated binary data with time-dependent covariates. Their research addressed carryover effects of responses on covariates and of covariates on responses through marginal models.
This research uses a random effects model with multiple outcomes to account for the changing impact of responses on covariates and of covariates on responses. It requires a series of distributions to address the time-dependent covariates, as each random effect relies on its own distribution. It differs from the marginal-model approach of Wilson, Vazquez, and Chen in that it models feedback effects through random effects (a conditional model). 

Keywords

Longitudinal studies

Correlation

Binary Models 

View Abstract 3172

Co-Author(s)

Vrijesh Tripathi, The University of the West Indies
Jeffrey Wilson, Arizona State University

First Author

Lori Selby, Arizona State University

Presenting Author

Lori Selby, Arizona State University

Addressing Unmeasured Confounders in Cox Hazard Models Using Nonparametric Bayesian Approaches

In observational studies, the presence of unmeasured confounders is a crucial challenge in accurately estimating the desired causal effects. To estimate the hazard ratio (HR) in Cox proportional hazards models, instrumental variable methods such as Two-Stage Residual Inclusion (Martinez-Camblor et al., 2019) and Limited Information Maximum Likelihood (Orihara, 2022) are typically employed. However, these methods raise several concerns, including the potential for biased HR estimates and issues with parameter identification. In this presentation, we introduce a novel nonparametric Bayesian method designed to estimate an unbiased HR while addressing the concerns related to parameter identification. Our proposed method consists of two phases: 1) detecting clusters based on the likelihood of the exposure variable, and 2) estimating the hazard ratio within each cluster. While the method implicitly assumes that unmeasured confounders affect outcomes through cluster effects, the algorithm is well suited to such data structures. We will present simulation results to evaluate the performance of our method. 
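
Below is a rough sketch of the two-phase idea using off-the-shelf tools (a truncated Dirichlet-process mixture for the clustering step and a standard Cox fit within each cluster), not the authors' nonparametric Bayesian estimator; the simulated data, variable names, and cluster-size threshold are illustrative assumptions only.

```python
# Phase 1: cluster subjects; Phase 2: estimate a within-cluster hazard ratio.
import numpy as np
import pandas as pd
from sklearn.mixture import BayesianGaussianMixture
from lifelines import CoxPHFitter

rng = np.random.default_rng(1)
n = 2000
u = rng.binomial(1, 0.5, n)                            # unmeasured confounder (acts via clusters)
z = rng.normal(size=n)                                 # instrument-like covariate
a = rng.binomial(1, 1 / (1 + np.exp(-(z + u))))        # exposure
t = rng.exponential(1 / np.exp(0.5 * a + 0.7 * u))     # event times, true log-HR of a = 0.5
c = rng.exponential(2.0, n)                            # censoring times
df = pd.DataFrame({"time": np.minimum(t, c), "event": (t <= c).astype(int),
                   "a": a, "z": z})

# Phase 1: cluster subjects using the inputs of the exposure model (here z and a).
bgm = BayesianGaussianMixture(n_components=10,
                              weight_concentration_prior_type="dirichlet_process",
                              random_state=0)
df["cluster"] = bgm.fit_predict(df[["z", "a"]])

# Phase 2: estimate a hazard ratio for the exposure within each sufficiently large cluster.
for g, sub in df.groupby("cluster"):
    if sub["event"].sum() < 20 or sub["a"].nunique() < 2:
        continue
    cph = CoxPHFitter().fit(sub[["time", "event", "a"]],
                            duration_col="time", event_col="event")
    print(g, float(np.exp(cph.params_["a"])))          # cluster-specific HR estimate
```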

Keywords

general Bayes

instrumental variable

Mendelian randomization

nonparametric Bayes

unmeasured confounders 

View Abstract 2676

Co-Author

Masataka Taguri, Tokyo Medical University

First Author

Shunichiro Orihara, Tokyo Medical University

Presenting Author

Shunichiro Orihara, Tokyo Medical University

Bayesian Approach to Sex-specific Mendelian Randomization Analysis

Mendelian randomization (MR) analysis is widely used in genetic epidemiology to estimate the causal effect of a risk factor on an outcome of interest. Increasing evidence shows the importance of sex differences in health and disease mechanisms. However, research on sex-specific causal effects is lacking due to limited sex-specific GWASs. In GWASs from the Million Veteran Program, which motivate this work, only 10% of individuals are female, and a major limitation of the resulting MR analyses is weak instrumental variables (IVs), which manifest as poor variant-exposure effect estimates and lead to unstable causal effect estimates. We propose a Bayesian framework that stabilizes female exposure GWAS effect sizes by borrowing information from the male population. By specifying a particular prior distribution on the female exposure GWAS effect sizes, we obtain two special cases of the posterior mean, corresponding to inverse variance-weighted meta-analysis and an adaptive weight approach. We perform a series of simulation studies to examine the performance of the proposed Bayesian approach in MR analysis. Finally, we apply the proposed method to estimate the causal effects of sleep phenotypes on cardiovascular-related diseases. 
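
A minimal numerical sketch of the borrowing idea follows: if the female variant-exposure effect receives a normal prior centered at the male estimate, the posterior mean is a precision-weighted (inverse variance-weighted) combination of the two estimates. All effect sizes and standard errors below are invented for illustration and are not taken from the Million Veteran Program data.

```python
import numpy as np

beta_f, se_f = 0.08, 0.05   # female exposure GWAS estimate and standard error (assumed)
beta_m, se_m = 0.10, 0.01   # male estimate used as prior information (assumed)

w_f, w_m = 1 / se_f**2, 1 / se_m**2
beta_post = (w_f * beta_f + w_m * beta_m) / (w_f + w_m)   # IVW meta-analysis special case
se_post = np.sqrt(1 / (w_f + w_m))
print(beta_post, se_post)

# The stabilized exposure effect then feeds a standard Wald-ratio MR estimate:
beta_outcome = 0.02                     # variant-outcome association (assumed)
wald_ratio = beta_outcome / beta_post   # causal effect per unit change in exposure
print(wald_ratio)
```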

Keywords

MR analysis

Bayesian framework

Sex-specific causal effect 

View Abstract 3392

Co-Author(s)

Nuzulul Kurniansyah, Department of Medicine, Brigham and Women’s Hospital
Daniel F Levey, Department of Psychiatry, Yale University School of Medicine
Joel Gelernter, Department of Psychiatry, Yale University School of Medicine
Jennifer Huffman, Center for Population Genomics, MAVERIC, VA Boston Healthcare System
Kelly Cho, Massachusetts Veterans Epidemiology Research and Information Center (MAVERIC), VA Boston Healthcare System
Peter Wilson, Division of Cardiology, Department of Medicine, Emory University School of Medicine
Daniel Gottlieb, Division of Sleep Medicine, Harvard Medical School
Kenneth Rice, University of Washington
Tamar Sofer, Beth Israel Deaconess Medical Center

First Author

Yu-Jyun Huang, Beth Israel Deaconess Medical Center

Presenting Author

Yu-Jyun Huang, Beth Israel Deaconess Medical Center

Bayesian Dirichlet Regression For Correlated Compositional Outcomes

It is common to observe compositional data in various fields, with a growing interest in considering compositional data as outcomes in regression settings. The motivation for this paper stems from a study investigating the impact of sleep restriction on physical activity outcomes. The compositional outcomes were measured under both short sleep and healthy sleep conditions for the same participants. To address the dependence observed in the compositional outcomes, we introduce a Mixed-Effects Dirichlet Regression (MEDR) model. This model is designed to account for correlated outcomes arising from repeated measurements on the same subject or clustering within a group. We utilize an alternative parameterization of the Dirichlet distribution, enabling the modeling of both mean and dispersion components. Our approach offers Markov Chain Monte Carlo (MCMC) tools that are easily implementable in the programming languages Stan and R. We apply the proposed MEDR model to an experimental sleep study and illustrate its performance through simulation studies. 
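
The following is a small sketch of the alternative Dirichlet parameterization the abstract refers to: the concentration vector is written as alpha = mu * phi, where mu lies on the simplex (the mean composition) and phi > 0 is a precision/dispersion parameter. The regression structure, link functions, and subject-level random-effect placeholder shown here are illustrative assumptions, not the fitted MEDR model (which the authors implement in Stan and R).

```python
import numpy as np
from scipy.stats import dirichlet
from scipy.special import softmax

def medr_like_loglik(y, X, B, log_phi, b_subject):
    """Log-likelihood of compositional outcomes y (n x D) under a mean/precision
    Dirichlet parameterization, with a subject-level random effect b_subject (n x D)."""
    eta = X @ B + b_subject            # linear predictors for the mean, one column per part
    mu = softmax(eta, axis=1)          # multinomial-logit-style link onto the simplex
    phi = np.exp(log_phi)              # log link for the precision component
    alpha = mu * phi
    return sum(dirichlet.logpdf(y[i], alpha[i]) for i in range(len(y)))

# toy data: 3-part compositions for 5 observations, 2 covariates
rng = np.random.default_rng(0)
y = rng.dirichlet([2.0, 3.0, 5.0], size=5)
X = np.column_stack([np.ones(5), rng.normal(size=5)])
B = rng.normal(scale=0.1, size=(2, 3))
print(medr_like_loglik(y, X, B, log_phi=np.log(20.0), b_subject=np.zeros((5, 3))))
```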

Keywords

Compositional data

Bayesian Dirichlet regression

Markov chain Monte Carlo

physical activity

sleep restriction 

View Abstract 2382

Co-Author(s)

Xia Wang, University of Cincinnati
Nanhua Zhang, Cincinnati Children's Hospital Medical Center

First Author

Eric Odoom, University of Cincinnati

Presenting Author

Eric Odoom, University of Cincinnati

Bayesian inference of antibody evolutionary dynamics using multitype branching processes

When our immune systems encounter foreign invaders, the B cells that produce our antibodies undergo a cyclic process of mutation and selection, competing to provide a refined immune response to the specific invader. To study how the immune system recognizes when the antibodies are sufficiently improved, we examine the state of the immune system in mice after an exposure to an artificial foreign agent by collecting genetic sequences of B cells. This experiment produces data only at one time point, so we lose all information about the preceding evolutionary process that mutates and selects B cells to optimize antibody efficiency. In this paper, we develop a multitype branching process model that integrates over unobserved antibody evolutionary histories and leverages parallel replications of immune responses we observed in experimentation. Our fully Bayesian approach, equipped with an efficient likelihood calculation algorithm and Markov chain Monte Carlo based approximation of the posterior, allows us to infer the currently-unknown functional relationship between the fitness of B cells that produce antibodies and the binding strength of these antibodies to pathogen-infected cells. 
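
As background, here is a toy forward simulation of a multitype branching process, the modeling object the abstract builds on. The two cell "types" (which could stand for binding-affinity classes), the offspring means, and the type-transition probabilities are invented; the Bayesian inference that integrates over unobserved histories is not shown.

```python
import numpy as np

rng = np.random.default_rng(11)
offspring_mean = np.array([1.3, 0.8])            # expected offspring counts for types 0 and 1 (assumed)
type_transition = np.array([[0.9, 0.1],          # offspring-type probabilities given parent type (assumed)
                            [0.2, 0.8]])

def simulate_generation(counts):
    """One generation: counts is a length-2 vector of current cell counts per type."""
    new_counts = np.zeros(2, dtype=int)
    for parent_type, n_parents in enumerate(counts):
        n_offspring = rng.poisson(offspring_mean[parent_type] * n_parents)
        new_counts += rng.multinomial(n_offspring, type_transition[parent_type])
    return new_counts

counts = np.array([10, 0])
for gen in range(10):
    counts = simulate_generation(counts)
print(counts)   # a single end-point observation, analogous to sequencing at one time point
```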

Keywords

immunology

phylogenetics

phylodynamics

stochastic processes 

View Abstract 2797

Co-Author(s)

William DeWitt, Postdoctoral Researcher
Yun Song, University of California-Berkeley
Frederick Matsen, Fred Hutchinson Cancer Research Center
Volodymyr Minin, University of California-Irvine

First Author

Thanasi Bakis

Presenting Author

Thanasi Bakis

Bayesian Spatio-temporal Regression Models for Group Testing Data

Group testing is a procedure that tests pooled groups of biospecimens instead of individual specimens. If a pool tests positive, subsequent tests are usually conducted on the individuals who contributed to the pool to determine their disease status; if a pool tests negative, all contributors are considered disease-free. When disease prevalence is relatively low, group testing reduces the number of diagnostic tests required and the associated costs. Spatio-temporal dependencies can arise in testing data collected across multiple locations and time points, but existing group testing models are not appropriate for spatio-temporal data. In this study, we propose two Bayesian spatio-temporal regression models for discrete-time areal group testing data. We apply the proposed models to COVID-19 testing data from 4,516 South Carolina residents (2020-2022) and 19,152 Central New York residents (2020). Our models accommodate various group testing protocols and can estimate the sensitivity and specificity of the diagnostic tests. Moreover, they produce forecast maps of future infection prevalence. This study showcases the effectiveness of group testing data for forecasting infectious diseases across locations. 
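
To make the data structure concrete, here is a toy simulation of pooled testing with imperfect sensitivity and specificity and Dorfman-style retesting. The prevalence, pool size, and assay operating characteristics are assumed values, not quantities estimated from the South Carolina or New York data, and only one of the many possible protocols is shown.

```python
import numpy as np

rng = np.random.default_rng(42)
n, pool_size = 5000, 5
prevalence, sensitivity, specificity = 0.02, 0.95, 0.99

status = rng.binomial(1, prevalence, n)                  # latent individual disease statuses
pools = status.reshape(-1, pool_size)
pool_has_case = pools.max(axis=1)                        # a pool is truly positive if any member is
p_test_pos = np.where(pool_has_case == 1, sensitivity, 1 - specificity)
pool_result = rng.binomial(1, p_test_pos)                # observed pool-level test outcomes

# Dorfman-style retesting: individuals in positive pools are then tested one by one.
n_tests = len(pool_result) + pool_result.sum() * pool_size
print("tests used:", int(n_tests), "vs", n, "individual tests")
```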

Keywords

group testing

Bayesian spatio-temporal model

infectious disease forecasting

conditional autoregressive model

vector autoregressive model

COVID-19 

View Abstract 2249

Co-Author(s)

Bo Cai, University of South Carolina
Alexander McLain, University of South Carolina
Melissa Nolan, University of South Carolina
Stella Self, University of South Carolina

First Author

Xingpei Zhao, University of South Carolina

Presenting Author

Xingpei Zhao, University of South Carolina

Emulating Functional Output of Dark Matter Power Spectra Using Deep Gaussian Processes

We construct a framework combining Gaussian processes and hierarchical modeling to estimate and emulate dark matter power spectra from multiple, dependent computer model simulations. We model the spectra as deep Gaussian processes, and consider multiple candidate models for the covariance structure of the simulations' deviations from the true spectra. Applying the best candidate model to the expensive simulations, we estimate the underlying power spectrum for a given cosmology. With these estimates calculated across multiple cosmologies, we build an emulator using functional principal components (and Gaussian processes on the weights) for unobserved cosmologies. We obtain promising results comparing against an existing method. 
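
A bare-bones version of the emulation step described above follows: decompose a set of (already estimated) spectra with principal components and fit independent Gaussian processes mapping cosmological parameters to the component weights. The synthetic "spectra," parameter ranges, and kernel choices are placeholders, and the deep-GP estimation of the spectra themselves is omitted.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
n_cosmo, n_k = 30, 100
theta = rng.uniform(size=(n_cosmo, 3))                 # cosmological parameters (toy)
k = np.linspace(-2, 1, n_k)                            # log wavenumber grid (toy)
spectra = (theta[:, [0]] * np.sin(k) + theta[:, [1]] * k
           + 0.1 * theta[:, [2]] + 0.01 * rng.normal(size=(n_cosmo, n_k)))

pca = PCA(n_components=5)
weights = pca.fit_transform(spectra)                   # functional PC scores per cosmology

gps = []
for j in range(weights.shape[1]):
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.3) + WhiteKernel(1e-4),
                                  normalize_y=True)
    gps.append(gp.fit(theta, weights[:, j]))           # one GP per principal component weight

# Emulate an unobserved cosmology: predict the PC weights, then reconstruct the spectrum.
theta_new = np.array([[0.2, 0.5, 0.8]])
w_new = np.column_stack([gp.predict(theta_new) for gp in gps])
spectrum_new = pca.inverse_transform(w_new)
print(spectrum_new.shape)
```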

Keywords

Gaussian processes

Bayesian modeling

Hierarchical modeling

Deep Gaussian processes

Cosmology 

View Abstract 2947

Co-Author(s)

Annie Booth, NC State University
David Higdon, Virginia Tech
Marco Ferreira, Virginia Tech

First Author

Stephen Walsh, Elms College

Presenting Author

Stephen Walsh, Elms College

Estimation of Optimal Treatment Regimes with Irregularly Observed Data via Bayesian Joint Modelling

Optimal dynamic treatment regimes (DTR) are sequences of decision rules aimed at determining the sequence of treatments tailored to patients, maximizing a long-term outcome. While conventional DTR estimation uses longitudinal data, there is little work on devising methods that use irregularly observed data to infer optimal DTRs. In this work, we first extend the target trial framework -- a paradigm to estimate specified statistical estimands under hypothetical scenarios using observational data -- to the DTR context; this extension allows treatment regimes to be defined with intervenable visit times. We propose an adapted version of G-computation marginalizing over random effects for rewards that encapsulate a treatment strategy's value. To estimate components of the G-computation formula, we then articulate a Bayesian joint model to handle correlated random effects between the outcome, visit and treatment processes. We also extend this model to allow flexible specifications of the random effects' distribution. Lastly, we show via simulation studies that failure to account for the observational treatment and visit processes produces bias in the estimation of regime rewards. 

Keywords

Dynamic treatment regime

Bayesian joint modelling

Target Trial Framework

G-computation

Irregularly observed data 

View Abstract 3761

Co-Author(s)

Eleanor Pullenayegum, Hospital for Sick Children
Olli Saarela, University of Toronto

First Author

Larry Dong

Presenting Author

Larry Dong

Hierarchical Bayesian Spatial Methods for Exposure Buffer-Size Selection in Place-Based Studies

Place-based epidemiology studies often rely on circular buffers to define exposure at spatial locations. Buffers are a popular choice due to their simplicity and alignment with public health policies. However, the buffer radius is often chosen relatively arbitrarily and assumed constant across space, which may result in biased effect estimates if these assumptions are violated. To address these limitations, we propose a novel method to inform buffer size selection and allow for spatial heterogeneity in radii across outcome units. Our model uses a spatially structured Gaussian process to model buffer radii as a function of covariates and spatial random effects, and a modified Bayesian variable selection framework to select the most appropriate radius distance. We perform a simulation study to understand the properties of our new method and apply our proposed method to a study of health care access and health outcomes in Madagascar. We find that our method outperforms existing approaches in terms of estimation and inference for key model parameters. By relaxing rigid assumptions about buffer characteristics, our method offers a flexible, data-driven approach to exposure definition. 
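
A small sketch of the quantity being modeled follows: exposure defined as the number of point sources falling inside a circular buffer of radius r around each outcome location, computed over a grid of candidate radii. The locations, the candidate radii, and the projected coordinate units are illustrative assumptions; the spatially structured prior over radii and the variable selection step are described only in the closing comment.

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(3)
outcome_xy = rng.uniform(0, 10_000, size=(200, 2))      # outcome units (meters, projected)
source_xy = rng.uniform(0, 10_000, size=(1_000, 2))     # exposure sources (e.g., facilities)
candidate_radii = np.array([250.0, 500.0, 1_000.0, 2_000.0])

tree = cKDTree(source_xy)
exposure = {
    r: np.array([len(idx) for idx in tree.query_ball_point(outcome_xy, r)])
    for r in candidate_radii
}
# The proposed method, rather than fixing one r for all locations, places a spatially
# structured Gaussian-process prior on radii and selects among candidates within the model.
print({r: e.mean() for r, e in exposure.items()})
```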

Keywords

Bayesian methods

exposure buffers

geographic and spatial uncertainty

place-based epidemiology

health studies 

View Abstract 2662

Co-Author

Joshua Warren, Yale University

First Author

Saskia Comess, Stanford University

Presenting Author

Saskia Comess, Stanford University

Integrating Side Information in Bayesian Variable Selection via Variational Inference

Bayesian variable selection (BVS) is a powerful tool in high-dimensional settings, as it incorporates prior information and performs model selection simultaneously. However, the potential of side information, such as previous studies or expert knowledge, to identify influential variables is often underutilized in BVS applications. For example, in a study of genetic markers of the nicotine metabolite ratio, p-values from previous studies are available. These p-values may be useful in determining the sparsity structure of the regression coefficients and can enhance the accuracy of the model results. Under the mean-field assumption and a spike-and-Gaussian-slab prior, variational Bayes (VB) with the coordinate ascent variational inference (CAVI) algorithm can be used to approximate the posterior distributions. To integrate side information into variable selection, we augment the sparse linear regression model with a conditional logistic model for the impact of the side information on the variable selection indicators. In this enhanced framework, the logistic component primarily governs the prior inclusion probability within the spike-and-slab prior. Our simulation studies suggest that incorporating side information improves variable selection performance. 
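
Here is a minimal sketch of how external p-values could steer a spike-and-slab prior: each coefficient's prior inclusion probability is a logistic function of a side-information score (here -log10 of the external p-value). The logistic form as written, and the intercept and slope values, are illustrative assumptions rather than the exact model in the abstract.

```python
import numpy as np

def prior_inclusion_prob(pvals_external, intercept=-3.0, slope=1.2):
    """Map external p-values to prior inclusion probabilities (hypothetical form)."""
    score = -np.log10(np.clip(pvals_external, 1e-300, 1.0))
    return 1.0 / (1.0 + np.exp(-(intercept + slope * score)))

p_ext = np.array([0.5, 0.04, 1e-6, 0.2])
pi = prior_inclusion_prob(p_ext)
print(np.round(pi, 3))   # stronger external evidence -> larger prior inclusion probability

# Under a spike-and-Gaussian-slab prior, beta_j ~ pi_j * N(0, tau2) + (1 - pi_j) * delta_0,
# and the CAVI updates then target the corresponding variational posterior.
```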

Keywords

Bayesian variable selection

side information

variational inference 

View Abstract 2227

First Author

Zichun Meng

Presenting Author

Zichun Meng

Leveraging historical data to compute predictive probability of success using Dirichlet Process prior

Predictive probability of success (PPoS) is a crucial decision-making tool computed at various phases of the drug development process to predict the success of a planned or ongoing trial. We propose a Dirichlet Process meta-analytic prior (DP-MAP), a non-parametric approach that accounts for the statistical heterogeneity among treatment effects across the historical studies used to construct an informative prior, for calculating PPoS. It allows for more robust inference in the case of prior-data conflict. Because the basic premise is to borrow only when the historical information is relevant, and some prior trials may agree or disagree with the current data, the DP provides a flexible solution: it borrows from earlier trials according to their similarity to the current trial and thereby resolves prior-data conflict.
In this paper, we assess the model fit of the DP-MAP prior and compare it with the fit of both the standard meta-analytic predictive (MAP) prior and the robust meta-analytic predictive (rMAP) prior. Using a real data example from historical RRMM trials, we demonstrate PPoS calculations at the design stage and at an interim analysis of an ongoing trial. 
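
The snippet below is a simplified Monte Carlo illustration of PPoS at the design stage: the treatment effect is drawn from an informative prior (a two-component normal mixture standing in for a MAP- or DP-MAP-type prior built from historical trials), the planned trial's estimate is simulated, and the success rate is counted. All numbers are invented, and the actual DP-MAP construction is not shown.

```python
import numpy as np

rng = np.random.default_rng(2024)
n_sim = 100_000

# stand-in informative prior on the true effect (e.g., a log hazard ratio)
component = rng.binomial(1, 0.7, n_sim)
theta = np.where(component == 1,
                 rng.normal(-0.30, 0.10, n_sim),    # "historical trials agree" component
                 rng.normal(0.00, 0.30, n_sim))     # vague, robustifying component

se_planned = 0.12                                   # standard error implied by the planned sample size (assumed)
theta_hat = rng.normal(theta, se_planned)           # simulated trial estimates
success = theta_hat + 1.96 * se_planned < 0         # success: upper 95% CI bound below 0

print("PPoS ~", success.mean())
```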

Keywords

Dirichlet process prior

Predictive probability of success

interim analysis

clinical trials

Bayesian statistics

go-no go decision 

View Abstract 2582

Co-Author

Ram Tiwari, Bristol Myers Squibb

First Author

Archie Sachdeva

Presenting Author

Archie Sachdeva

Model-based clustering via Bayesian estimation of Gaussian graphical models and precision matrices

Finite Gaussian mixture models are ubiquitous in model-based clustering of continuous data, but their parameters scale quadratically with the number of variables. A rich literature exists on parsimonious models via covariance matrix decompositions or other structural assumptions; however, these models do not allow for direct estimation of conditional independencies via sparse precision matrices. Here, we introduce mixtures of Gaussian graphical models for model-based clustering with sparse precision matrices. We employ recent developments in Bayesian estimation of Gaussian graphical models to circumvent the doubly intractable partition function of the G-Wishart distribution and use conditional Bayes factors for model comparison within a Metropolis-Hastings framework. We extend these developments to mixtures of Gaussian graphical models and use them to estimate conditional independence structures in the different mixture components via fast joint estimation of the graphs and precision matrices. Our framework results in a parsimonious model-based clustering of the data and provides conditional independence interpretations of the mixture components. 

Keywords

Model-based clustering

Finite Gaussian mixture models

Precision matrix

Gaussian graphical model

Markov chain Monte Carlo (MCMC)

G-Wishart Distribution 

View Abstract 3201

Co-Author

Adrian Dobra, University of Washington

First Author

David Marcano

Presenting Author

David Marcano

Non-Euclidean Bayesian Constraint Relaxation via Divergence-to-Set Priors

Constraints on parameter spaces promote various structures in Bayesian inference, but they also present methodological challenges, such as efficiently sampling from the posterior. While recent work has tackled this important problem through various approaches to constraint relaxation, much of the underlying machinery assumes that the parameter space is Euclidean, an assumption that does not hold in many settings. Building on the recently proposed class of distance-to-set priors (Presman and Xu, 2023), this talk explores extensions of constraint relaxation to non-Euclidean spaces. We propose a natural extension of these priors, which we call (Bregman) divergence-to-set priors, exemplify settings where they can be leveraged, and demonstrate how techniques from an optimization algorithm known as mirror descent can be utilized for non-Euclidean Bayesian constraint relaxation. 
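
One plausible schematic of such a prior, generalizing the Euclidean distance-to-set construction, is sketched below. The symbols (generator function, constraint set C, relaxation parameter lambda, base prior) and the exact functional form are assumptions for illustration and may differ from the construction presented in the talk.

```latex
% Bregman divergence generated by a convex function \varphi, and its set version:
\[
  D_\varphi(\theta, c) = \varphi(\theta) - \varphi(c) - \langle \nabla \varphi(c),\, \theta - c \rangle,
  \qquad
  D_\varphi(\theta, \mathcal{C}) = \inf_{c \in \mathcal{C}} D_\varphi(\theta, c).
\]
% A divergence-to-set prior then relaxes the constraint \theta \in \mathcal{C} softly:
\[
  \pi_\lambda(\theta) \propto \pi_0(\theta)\, \exp\{-\lambda\, D_\varphi(\theta, \mathcal{C})\},
\]
% with larger \lambda enforcing the constraint more strictly; the choice
% \varphi(x) = \|x\|^2 / 2 recovers a squared Euclidean distance-to-set penalty.
```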

Keywords

Constraint relaxation

Hamiltonian Monte Carlo

Bregman divergence

MCMC Sampler 

View Abstract 3277

Co-Author

Jason Xu

First Author

Rick Presman, Duke University

Presenting Author

Rick Presman, Duke University

Parameter-expanded data augmentation for analyzing categorical data using multinomial probit models

The multinomial probit model is a popular tool for analyzing nominal categorical data. However, the model's identification issue, which requires restricting the first element of the covariance matrix of the latent variables, makes it challenging to develop efficient Markov chain Monte Carlo (MCMC) methods. Parameter-expanded data augmentation (PX-DA) is a well-known technique that introduces a working/artificial parameter or parameter vector to transform an identifiable model into a non-identifiable one; this transformation can improve the mixing and convergence of the data augmentation components. We therefore propose a PX-DA algorithm for analyzing categorical data with multinomial probit models. We examine both identifiable and non-identifiable multinomial probit models and develop the corresponding MCMC algorithms. The constructed non-identifiable model bypasses the Metropolis-Hastings step otherwise needed for sampling the covariance matrix, resulting in enhanced convergence and improved mixing of the MCMC components. We conduct simulation studies to demonstrate the proposed methods and apply them to real data from the Six Cities study. 
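
For orientation, a schematic of the latent-utility representation and of the parameter expansion is sketched below in generic notation. The baseline-category convention, the restriction shown, and the working scale alpha are standard textbook choices assumed for illustration and may differ in detail from the authors' formulation.

```latex
% Multinomial probit with J categories: J-1 latent utilities relative to a baseline.
\[
  U_i = X_i \beta + \varepsilon_i, \qquad \varepsilon_i \sim N_{J-1}(0, \Sigma),
  \qquad
  Y_i =
  \begin{cases}
    j & \text{if } U_{ij} = \max_k U_{ik} > 0,\\
    \text{baseline} & \text{if } \max_k U_{ik} \le 0.
  \end{cases}
\]
% Identification requires a scale restriction such as \sigma_{11} = 1. PX-DA instead works
% with an expanded, non-identifiable model via a working scale \alpha > 0,
\[
  \tilde{U}_i = \alpha U_i, \qquad \tilde{\beta} = \alpha \beta, \qquad \tilde{\Sigma} = \alpha^2 \Sigma,
\]
% sampling on the expanded scale and transforming back, which typically improves the
% mixing of the data-augmentation sampler.
```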

Keywords

multinomial probit model

latent variable

parameter-expanded

data augmentation

MCMC

non-identifiable model 

View Abstract 3304

Co-Author

Xiao Zhang, Michigan Technological University

First Author

Suwash Silwal

Presenting Author

Suwash Silwal

Predicting Multi-wave COVID-19 Cases Using Logistic Growth Modeling

During the COVID-19 outbreak, the global community encountered numerous challenges, underscoring the need for effective prediction models to inform public health interventions and optimize resource allocation. Traditional compartmental models such as the SIR (Susceptible-Infected-Recovered) model and its variants have been employed to predict disease prevalence. However, these models have limitations: they struggle to detect multiple waves and are sensitive to initial parameters, necessitating time-consuming parameter tuning. In this study, we propose an approach to identify multi-wave patterns in COVID-19 cases. Our method uses Bayesian changepoint detection to identify the multiple waves and then applies a logistic growth model to estimate daily COVID-19 cases, including hospitalizations and ICU patients. We evaluate the model's accuracy using the Mean Absolute Percentage Error (MAPE). 
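
A compact sketch of the per-wave fitting step follows: within a wave segment (which the changepoint detection would supply), a logistic growth curve is fit to cumulative cases and scored with MAPE. The synthetic data, parameter values, and starting values are illustrative only, and the Bayesian changepoint step is not shown.

```python
import numpy as np
from scipy.optimize import curve_fit

def logistic_growth(t, K, r, t0):
    """Cumulative cases: carrying capacity K, growth rate r, midpoint t0."""
    return K / (1.0 + np.exp(-r * (t - t0)))

rng = np.random.default_rng(7)
t = np.arange(60, dtype=float)                                   # days within one wave
true_curve = logistic_growth(t, K=10_000, r=0.2, t0=30)
cases = true_curve * (1 + 0.03 * rng.normal(size=t.size))        # noisy cumulative counts

params, _ = curve_fit(logistic_growth, t, cases, p0=[cases.max(), 0.1, t.mean()])
fitted = logistic_growth(t, *params)

mape = 100 * np.mean(np.abs((cases - fitted) / cases))
print(params, round(mape, 2))
```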

Keywords

SIR model

Bayesian changepoint

Mean Absolute Percentage Errors 

View Abstract 3607

Co-Author

Bong-Jin Choi, North Dakota State University

First Author

Idamawatte Gedara Idamawatta

Presenting Author

Idamawatte Gedara Idamawatta

Specifying prior distributions in reliability applications

Especially when facing reliability data with limited information (e.g., a small number of failures), there are strong motivations for using Bayesian inference methods. These include the option to use information from physics-of-failure or previous experience with a failure mode in a particular material to specify an informative prior distribution. Another advantage is the ability to make statistical inferences without having to rely on specious (when the number of failures is small) asymptotic theory needed to justify non-Bayesian methods. Users of non-Bayesian methods are faced with multiple methods of constructing uncertainty intervals (Wald, likelihood, and various bootstrap methods) that can give substantially different answers when there is little information in the data. For Bayesian inference, there is only one method, but it is necessary to provide a prior distribution to fully specify the model. This presentation reviews some of this work and provides, evaluates, and illustrates principled extensions and adaptations of these methods to the practical realities of reliability data (e.g., non-trivial censoring). 
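
Below is a small grid-based sketch of the kind of problem described: Bayesian inference for a Weibull lifetime model with right censoring and very few failures, using an informative prior on the shape parameter. The data, grid ranges, and prior values are assumed purely for illustration and are not the presenters' recommended priors or methodology.

```python
import numpy as np
from scipy.stats import weibull_min, lognorm

# a few failures and many right-censored units (hours), invented for illustration
fail = np.array([150.0, 310.0, 620.0])
cens = np.full(30, 1000.0)

beta_grid = np.linspace(0.3, 5.0, 200)        # Weibull shape
eta_grid = np.linspace(100.0, 20000.0, 400)   # Weibull scale (characteristic life)
B, E = np.meshgrid(beta_grid, eta_grid, indexing="ij")

# likelihood: density for failures, survival function for censored observations
loglik = (weibull_min.logpdf(fail[:, None, None], B, scale=E).sum(axis=0)
          + weibull_min.logsf(cens[:, None, None], B, scale=E).sum(axis=0))
# informative lognormal prior on the shape (as might come from physics-of-failure
# knowledge); flat prior on the scale over the grid
logprior = lognorm.logpdf(B, s=0.5, scale=2.0)
logpost = loglik + logprior
post = np.exp(logpost - logpost.max())
post /= post.sum()

print(round((post * B).sum(), 2), round((post * E).sum(), 1))   # posterior means on the grid
```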

Keywords

Bayesian inference

default prior

Reliability

few failures

noninformative prior

reference prior 

View Abstract 3050

Co-Author(s)

Colin Lewis-Beck
Jarad Niemi, Iowa State University
William Meeker, Iowa State University

First Author

Qinglong Tian

Presenting Author

Colin Lewis-Beck