Monday, Aug 4: 2:00 PM - 3:50 PM
4070
Contributed Papers
Music City Center
Room: CC-202B
Main Sponsor
Section on Bayesian Statistical Science
Presentations
This study explores estimation of the conditional survival function for heavy-tailed distributions under right censoring, a prevalent issue in fields such as medical science. We introduce a novel Bayesian semiparametric approach that combines a Dirichlet process mixture (DPM) model with the generalized Pareto distribution (GPD), enabling robust estimation of conditional survival functions within a unified model. The DPM models the central portion of the distribution below a specified threshold, while the GPD captures the tail behavior beyond this threshold. Our approach accommodates random right censoring and incorporates covariate information, improving the estimation of conditional survival and hazard functions tailored to specific covariates. This paper presents a first development of Bayesian models in this area, along with simulation studies and real-data applications that demonstrate significant improvements in the accuracy and reliability of conditional survival function estimates over traditional methods.
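The bulk-plus-tail construction described above can be illustrated with a minimal sketch. It is not the authors' model: a fixed two-component lognormal mixture stands in for the DPM bulk, covariates and censoring are omitted, and all names and parameter values (`bulk_sf`, `spliced_sf`, the threshold `u`, `xi`, `scale`) are assumptions made for illustration.

```python
import math


def lognormal_sf(t, mu, sigma):
    """Survival function P(T > t) of a lognormal distribution."""
    if t <= 0:
        return 1.0
    z = (math.log(t) - mu) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def bulk_sf(t, weights, mus, sigmas):
    """Finite lognormal mixture, a hypothetical stand-in for the DPM bulk."""
    return sum(w * lognormal_sf(t, m, s) for w, m, s in zip(weights, mus, sigmas))


def gpd_sf(y, xi, scale):
    """Generalized Pareto survival function for an excess y over the threshold."""
    if y <= 0:
        return 1.0
    if abs(xi) < 1e-12:
        return math.exp(-y / scale)  # exponential limit as xi -> 0
    base = 1.0 + xi * y / scale
    return base ** (-1.0 / xi) if base > 0 else 0.0


def spliced_sf(t, u, xi, scale, weights, mus, sigmas):
    """Bulk model up to threshold u, GPD tail beyond it.

    For t > u: S(t) = S_bulk(u) * S_GPD(t - u), so S is continuous at u.
    """
    if t <= u:
        return bulk_sf(t, weights, mus, sigmas)
    return bulk_sf(u, weights, mus, sigmas) * gpd_sf(t - u, xi, scale)


if __name__ == "__main__":
    w, m, s = [0.6, 0.4], [0.0, 1.0], [0.5, 0.8]
    for t in (1.0, 5.0, 8.0, 20.0):
        print(t, spliced_sf(t, u=5.0, xi=0.5, scale=2.0, weights=w, mus=m, sigmas=s))
```

Multiplying the tail by the bulk survival at the threshold keeps the spliced function continuous there, which is the usual motivation for this kind of construction.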
Keywords
Bayesian Nonparametric
Dirichlet Process Mixture Model
Generalized Pareto Distribution
Right Censoring
Survival Curve Estimation
Extreme Quantile Estimation
Spatial clustering is crucial in disease mapping by identifying subregions with different patterns of disease incidence or mortality.
This study proposes a novel Bayesian spatial clustering method for multivariate spatial disease data, which allows for understanding geographic variations of multivariate disease patterns while accounting for both spatial information and dependence among multiple disease measurements. We develop a new random tele-connected graph partition model with an unknown number of clusters, which is capable of encouraging locally contiguous clusters and allowing for remote subregions to be clustered together.
We use this prior in a Bayesian hierarchical model to detect spatial clusters and estimate cluster-specific disease patterns and dependence across the multivariate disease variables. We develop a tailored Markov chain Monte Carlo (MCMC) algorithm for posterior inference, utilizing efficient doubly split-merge samplers taking advantage of graph algorithms. We illustrate our method with simulation studies and apply it to investigate the clustering patterns of county-level prostate cancer mortality rate decline across six southern U.S. states from 1985 to 2014.
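As a rough illustration of how a spanning tree can encode locally contiguous clusters, the sketch below removes a set of edges from a tree and labels the resulting connected components. It covers only the contiguous part of the idea; the tele-connection of remote subregions and the prior itself are not reproduced, and the function name `clusters_from_tree` and the toy graph are assumptions.

```python
def clusters_from_tree(n_nodes, tree_edges, cut_edges):
    """Label nodes by connected component after cutting edges of a spanning tree.

    Cutting k - 1 edges of a spanning tree leaves k components, each of
    which is connected (contiguous) in the underlying graph.
    """
    cut = {frozenset(e) for e in cut_edges}
    adj = {i: [] for i in range(n_nodes)}
    for a, b in tree_edges:
        if frozenset((a, b)) not in cut:
            adj[a].append(b)
            adj[b].append(a)

    labels = [-1] * n_nodes
    next_label = 0
    for start in range(n_nodes):
        if labels[start] != -1:
            continue
        # Depth-first search over the remaining tree edges.
        labels[start] = next_label
        stack = [start]
        while stack:
            v = stack.pop()
            for w in adj[v]:
                if labels[w] == -1:
                    labels[w] = next_label
                    stack.append(w)
        next_label += 1
    return labels


if __name__ == "__main__":
    # Toy example: five regions on a path 0-1-2-3-4, with two edges cut.
    path = [(0, 1), (1, 2), (2, 3), (3, 4)]
    print(clusters_from_tree(5, path, cut_edges=[(1, 2), (3, 4)]))
```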
Keywords
Inverse Wishart
Random Spanning Trees
Reversible-Jump MCMC
Spatial Clustering
Stirling Number of Second Kind
Semiparametric regression models containing linear and nonlinear additive components generalize multiple linear regression models. We prefer them to fully nonparametric models when some covariates have linear effects. While variable selection for multiple linear regression has been widely studied, work on additive partial linear models (APLMs) is more recent. We develop a Bayesian group selection method for APLMs using splines to approximate the nonlinear functions. Our work is based on a hierarchical model with priors on regression coefficients, spline coefficients, and model space. We prove model selection consistency even when the number of predictors grows nearly exponentially with sample size. We propose a scalable algorithm for exploring gigantic model spaces and efficiently detecting regions of high posterior probability. Various simulation setups are used to evaluate our proposed approach and compare its performance with other available methods. An analysis of data from a genome-wide association study, with 360 observations on a particular plant trait as the response and nearly a million SNPs and 30,000 gene expressions as predictors, demonstrates the scalability and performance of our approach.
Keywords
Genome wide association study
Hierarchical Model
Group selection
Stochastic Search
Additive Partial Linear Model
Posterior Prediction
We introduce a varying-weight dependent Dirichlet process (DDP) model to implement a semi-parametric GLM. The model extends a recently developed semi-parametric generalized linear model (SPGLM) by adding a nonparametric Bayesian prior on the baseline distribution of the GLM. We show that the resulting model takes the form of an inhomogeneous normalized random measure that arises from exponential tilting of a normalized completely random measure. Building on familiar posterior simulation methods for mixtures with respect to normalized random measures, we introduce posterior simulation for the resulting semi-parametric GLM. The proposed methodology is validated through a series of simulation studies and is illustrated using data from a speech intelligibility study.
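For orientation, the exponential tilting mentioned above can be written out explicitly. This is a sketch of one common way to express an SPGLM, not necessarily the parameterization used by the authors; the symbols $f_0$ (baseline density), $\theta_x$ (tilting parameter), $g$ (link function), and $\beta$ are notation assumed here.

```latex
p(y \mid x) \;=\; \frac{\exp(\theta_x\, y)\, f_0(y)}{\int \exp(\theta_x\, u)\, f_0(u)\, du},
\qquad
\text{with } \theta_x \text{ chosen so that } g\bigl(\mathrm{E}[\,y \mid x\,]\bigr) = x^\top \beta .
```

Under this form, a nonparametric prior on $f_0$ would make the tilted, renormalized density a covariate-dependent random measure, which is consistent with the abstract's description.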
Keywords
Dependent Dirichlet process
Inhomogeneous normalized random measures
Density regression
Lévy-Khintchine representation
Semiparametric generalized linear model
We introduce a generalized Bayes framework for predicting individual-level restricted mean survival times (RMST) without relying on strict survival model assumptions. Our method employs an RMST-targeted loss function using inverse probability of censoring weights (IPCW), enabling the handling of informative censoring by modeling only the censoring distribution. We incorporate a flexible additive tree regression model and construct pseudo-Bayesian posteriors via model-averaging IPCW-conditional loss functions. Through simulations and application to a multi-site breast cancer cohort, we demonstrate improved predictive performance over standard survival machine learning methods. Additionally, we will describe how this framework can be extended to perform dynamic RMST prediction.
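To make the IPCW idea concrete, here is a minimal sketch of a squared-error loss targeted at RMST with restriction time `tau`. It is an illustration under simplifying assumptions, not the authors' loss: the censoring survival function `censor_sf` is treated as known and covariate-free, and the names `ipcw_rmst_loss`, `preds`, and `censor_sf` are hypothetical.

```python
import math


def ipcw_rmst_loss(times, events, preds, tau, censor_sf):
    """IPCW-weighted squared error for predicting min(T, tau).

    times     : observed follow-up times U_i = min(T_i, C_i)
    events    : 1 if the event was observed, 0 if censored
    preds     : predicted restricted mean survival times
    tau       : restriction time
    censor_sf : callable t -> P(C > t), assumed known here
    """
    total = 0.0
    for u, d, f in zip(times, events, preds):
        y = min(u, tau)
        # min(T, tau) is fully observed if the event occurred
        # or follow-up reached tau without censoring.
        observed = (d == 1) or (u >= tau)
        if not observed:
            continue
        weight = 1.0 / censor_sf(y)  # inverse probability of remaining uncensored
        total += weight * (y - f) ** 2
    return total / len(times)


if __name__ == "__main__":
    times = [1.0, 5.0, 3.0]
    events = [1, 1, 0]
    preds = [2.0, 4.0, 0.0]
    # Hypothetical exponential censoring with rate 0.1.
    print(ipcw_rmst_loss(times, events, preds, tau=4.0,
                         censor_sf=lambda t: math.exp(-0.1 * t)))
```

In a generalized Bayes (Gibbs posterior) setting, a loss of this kind would take the place of a negative log-likelihood, scaled by a learning-rate parameter; that step is not shown here.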
Keywords
dependent censoring, ensemble methods, Gibbs posterior, inverse weighting, loss function, survival analysis.
In this project, we cluster observations such that cluster membership is influenced by a set of covariates. To that end, we employ the Bayesian nonparametric Common Atom Model (CAM), a nested clustering algorithm that uses a fixed group membership for each observation to encourage more similar clustering of members of the same group. CAM assumes each group has its own vector of cluster probabilities, which are themselves clustered to allow similar clustering for some groups. We extend CAM by treating the group membership as an unknown latent variable determined by the covariates. Thus, observations with similar predictor values will be in the same latent group and are more likely to be clustered together than observations with disparate predictors. We propose a Pyramid Group Model (PGM) that flexibly partitions the predictor space into these latent group memberships. The PGM operates similarly to a Bayesian CART process, except that it uses the same splitting rule for all nodes at the same tree depth. We propose a block Gibbs sampler to perform posterior inference for our model. Our methodology is demonstrated in simulation and on real data.
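The "same splitting rule at each depth" structure can be sketched as follows. This is a hypothetical illustration of a depth-wise partition of the predictor space, not the PGM itself: the rules are fixed rather than sampled, and the name `latent_group` and the (feature index, threshold) rule format are assumptions.

```python
def latent_group(x, rules):
    """Map a predictor vector to a latent group index.

    rules is a list of (feature_index, threshold) pairs, one per tree depth.
    Every node at a given depth applies the same rule, so a tree of depth D
    defines at most 2**D groups, indexed by the sequence of left/right turns.
    """
    group = 0
    for feature, threshold in rules:
        go_right = 1 if x[feature] > threshold else 0
        group = 2 * group + go_right
    return group


if __name__ == "__main__":
    # Depth 1 splits on feature 0 at 0.5; depth 2 splits on feature 1 at 2.0.
    rules = [(0, 0.5), (1, 2.0)]
    for x in ([0.3, 1.0], [0.3, 3.0], [0.7, 1.0], [0.7, 3.0]):
        print(x, latent_group(x, rules))
```

Observations with similar predictor values follow the same sequence of turns and therefore share a group index, which matches the behavior the abstract describes for latent group membership.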
Keywords
Nonparametric, Clustering, Covariates, Latent group-membership, Pyramid Group Model, Block Gibbs sampler, Simulations, Real data