Predictor-Informed Bayesian Nonparametric Clustering

Jeremy Gaskins Co-Author
University of Louisville
 
Md Yasin Ali Parh First Author
 
Md Yasin Ali Parh Presenting Author
 
Monday, Aug 4: 3:20 PM - 3:35 PM
1177 
Contributed Papers 
Music City Center 
In this project, we cluster observations so that cluster membership is influenced by a set of covariates. To that end, we employ the Bayesian nonparametric Common Atom Model (CAM), a nested clustering algorithm that uses a fixed group membership for each observation to encourage similar clustering among members of the same group. CAM assumes each group has its own vector of cluster probabilities, and these vectors are themselves clustered so that some groups share similar clustering. We extend CAM by treating group membership as an unknown latent variable determined by the covariates. Thus, observations with similar predictor values fall in the same latent group and are more likely to be clustered together than observations with disparate predictors. We propose a Pyramid Group Model (PGM) that flexibly partitions the predictor space into these latent groups. The PGM operates similarly to a Bayesian CART process, except that it uses the same splitting rule at all nodes of the same tree depth. We propose a block Gibbs sampler to perform posterior inference for our model. Our methodology is demonstrated on simulated and real data.
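
To illustrate the depth-shared splitting rule described above, the following is a minimal sketch, not the authors' implementation. It assumes each depth has a single rule of the form "feature j exceeds threshold t", applied at every node of that depth, so that D depths yield up to 2^D latent groups. The function name `pyramid_groups` and the parameters `split_features` and `split_thresholds` are hypothetical. The rules are fixed here, whereas the abstract's model places a prior on them and infers them with a block Gibbs sampler.

```python
import numpy as np


def pyramid_groups(X, split_features, split_thresholds):
    """Assign each row of X to a latent group using one rule per tree depth.

    Assumption (not from the abstract): the rule at each depth is
    X[:, feature] > threshold, shared by all nodes at that depth.
    """
    X = np.asarray(X, dtype=float)
    groups = np.zeros(X.shape[0], dtype=int)
    for feature, threshold in zip(split_features, split_thresholds):
        go_right = X[:, feature] > threshold
        # Append one binary digit per depth: left = 0, right = 1.
        groups = 2 * groups + go_right.astype(int)
    return groups


# Toy predictors: two covariates, four observations.
X = np.array([[0.1, 5.0], [0.9, 5.0], [0.1, 9.0], [0.9, 9.0]])
labels = pyramid_groups(X, split_features=[0, 1], split_thresholds=[0.5, 7.0])
print(labels)
```

Because every node at a given depth shares its rule, the tree is described by D (feature, threshold) pairs rather than one pair per internal node, which is the sense in which this differs from an ordinary CART partition.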

Keywords

Nonparametric, Clustering, Covariates, Latent group-membership, Pyramid Group Model, Block Gibbs sampler, Simulations, Real data

Main Sponsor

Section on Bayesian Statistical Science