Bayesian Mixture Models for Histograms: with Applications to Large Datasets

Abstract Number:

3661 

Submission Type:

Contributed Abstract 

Contributed Abstract Type:

Paper 

Participants:

Richard Warr (1)

Institutions:

(1) Brigham Young University, N/A

First Author:

Richard Warr  
Brigham Young University

Presenting Author:

Richard Warr  
Brigham Young University

Abstract Text:

For privacy or summarization purposes, data are often provided only as a table or histogram, that is, as bins with associated frequencies. In this work we present a method that fits a mixture distribution to such data in order to model the probability density function of the underlying population. We focus on mixtures of normal distributions; however, the method could be generalized to mixtures of other distributions. A prior is placed on the number of mixture components, which may be finite or countably infinite, and inference is carried out using reversible jump MCMC. We demonstrate attractive properties of the method, which shows considerable promise for modeling large datasets using a Bayesian nonparametric approach. Additionally, we consider the case of multiple histograms and cluster them using the Dirichlet process. This clustering allows information to be shared between populations and provides a posterior probability of homogeneity between populations.
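
To illustrate how binned data can inform a normal mixture, here is a minimal sketch of one plausible likelihood. The abstract does not state the likelihood used, so this is an assumption: bin counts are treated as multinomial, with each bin's probability equal to the mixture mass between its edges. The function names, bin edges, counts, and parameter values are all hypothetical, and the sketch does not cover the prior on the number of components, the reversible jump MCMC sampler, or the Dirichlet process clustering.

```python
import math


def normal_cdf(x, mean, sd):
    """CDF of a normal distribution, computed with the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def bin_probabilities(edges, weights, means, sds):
    """Mass that the normal mixture assigns to each bin [edges[j], edges[j+1])."""
    probs = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        p = sum(
            w * (normal_cdf(hi, m, s) - normal_cdf(lo, m, s))
            for w, m, s in zip(weights, means, sds)
        )
        probs.append(p)
    return probs


def binned_log_likelihood(edges, counts, weights, means, sds):
    """Multinomial log-likelihood of the bin counts, up to an additive constant.

    Assumed form (not taken from the abstract): sum_j n_j * log(p_j).
    Empty bins contribute nothing; a nonempty bin with zero mass would
    raise ValueError from math.log.
    """
    probs = bin_probabilities(edges, weights, means, sds)
    return sum(n * math.log(p) for n, p in zip(counts, probs) if n > 0)


# Hypothetical histogram: open-ended outer bins so the bin masses sum to one.
edges = [-math.inf, -1.0, 0.0, 1.0, 2.0, 3.0, math.inf]
counts = [40, 160, 180, 90, 110, 20]  # made-up frequencies, for illustration only

# Compare a one-component and a two-component candidate mixture.
one_comp = binned_log_likelihood(edges, counts, [1.0], [0.5], [1.2])
two_comp = binned_log_likelihood(
    edges, counts, [0.7, 0.3], [0.0, 2.3], [0.8, 0.5]
)
print(one_comp, two_comp)
```

Because the likelihood depends on the data only through the bin counts, its evaluation cost scales with the number of bins rather than the number of observations, which is one reason a binned formulation of this kind is appealing for large datasets.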

Keywords:

Dirichlet Process | Data Privacy | Big Data

Sponsors:

Section on Bayesian Statistical Science

Tracks:

Bayesian nonparametrics

I have read and understand that JSM participants must abide by the Participant Guidelines.

Yes

I understand that JSM participants must register and pay the appropriate registration fee by June 1, 2024. The registration fee is non-refundable.

I understand