Sparse Bayesian Clustering for Bounded Data via a Multivariate Beta Mixture Model

Abstract Number:

3361 

Submission Type:

Contributed Abstract 

Contributed Abstract Type:

Speed 

Participants:

Carmen Rodriguez Cabrera (1), Briana Stephenson (2)

Institutions:

(1) Harvard University, (2) Harvard T.H. Chan School of Public Health

Co-Author:

Briana Stephenson  
Harvard T.H. Chan School of Public Health

Speaker:

Carmen Rodriguez Cabrera  
Harvard University

Abstract Text:

We develop a Bayesian overfitted multivariate beta mixture model for clustering aggregated ecological data bounded between 0 and 1. Such data, common in social determinants of health (SDoH) research, are poorly served by standard clustering methods, which impose restrictive distributional assumptions and offer limited interpretability. The proposed model reparameterizes the multivariate beta distribution in terms of mean and concentration parameters, enabling direct interpretation of cluster-specific profiles while accommodating skewness inherent in the data. Integrated feature saliency operates on cluster means to induce sparsity by identifying variables that meaningfully drive clustering and shrinking uninformative features toward a shared mean. An overfitted mixture formulation supports data-driven inference on the number of clusters while preserving posterior uncertainty. We assess performance through simulation studies and apply the model to neighborhood-level SDoH data from the Agency for Healthcare Research and Quality, yielding interpretable ecological clusters. The framework generalizes to a broad class of bounded, aggregated multivariate data.
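
As an illustration of the mean and concentration reparameterization mentioned above, the following is a minimal sketch for a single univariate beta margin only. It uses the standard mapping from a mean mu and concentration phi to the shape parameters (mu * phi, (1 - mu) * phi). The function names are hypothetical, and the multivariate form, priors, and feature saliency structure used in the proposed model are not specified in this abstract and may differ.

```python
import math


def shapes_from_mean_concentration(mu, phi):
    """Map a mean mu in (0, 1) and a concentration phi > 0 to Beta shapes (a, b).

    Standard univariate mapping (assumed here for illustration):
    a = mu * phi, b = (1 - mu) * phi, so that a / (a + b) = mu and a + b = phi.
    """
    if not 0.0 < mu < 1.0:
        raise ValueError("mu must lie strictly between 0 and 1")
    if phi <= 0.0:
        raise ValueError("phi must be positive")
    return mu * phi, (1.0 - mu) * phi


def beta_variance(mu, phi):
    """Variance of a Beta margin with mean mu and concentration phi."""
    return mu * (1.0 - mu) / (phi + 1.0)


def beta_log_density(x, mu, phi):
    """Log density of Beta(mu * phi, (1 - mu) * phi) at a point x in (0, 1)."""
    if not 0.0 < x < 1.0:
        raise ValueError("x must lie strictly between 0 and 1")
    a, b = shapes_from_mean_concentration(mu, phi)
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return log_norm + (a - 1.0) * math.log(x) + (b - 1.0) * math.log(1.0 - x)


# Hypothetical cluster profile for one variable: mean 0.2, concentration 10.
a, b = shapes_from_mean_concentration(0.2, 10.0)
```

Under this parameterization the mean is read off directly as the cluster profile for a variable, while a larger phi gives a smaller variance around that mean; a mean away from 0.5 yields a skewed margin.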

Keywords:

Bayesian mixture model | multivariate beta distribution | sparse modeling | ecological data | feature saliency

Sponsors:

Section on Bayesian Statistical Science

Tracks:

Unsupervised Learning

Can this be considered for alternate subtype?

Yes

Are you interested in volunteering to serve as a session chair?

Yes

I have read and understand that JSM participants must abide by the Participant Guidelines.

Yes

I understand that JSM participants must register and pay the appropriate registration fee by June 1, 2026. The registration fee is non-refundable.

I understand