08: Clustering Multivariate Discrete Data with Partial Records

Utkarsh Dang Co-Author
University of Guelph
 
Sanjeena Dang Co-Author
Carleton University
 
Kevin Giddings First Author
 
Kevin Giddings Presenting Author
 
Monday, Aug 4: 10:30 AM - 12:20 PM
2116 
Contributed Posters 
Music City Center 
Clustering data with incomplete records is vital in many disciplines. Here, we develop a model-based approach for clustering multivariate discrete data with missing entries using a mixture of multivariate Poisson lognormal distributions. A multivariate Poisson lognormal distribution is a hierarchical Poisson distribution that can account for over-dispersion and model the correlation between variables. To illustrate the effectiveness of this method, we designed a variety of simulation studies that show its robustness under different percentages of incomplete records and different patterns of missing data. Additionally, the approach is applied to cluster partial records from a proteomics dataset.
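The hierarchical structure described in the abstract can be illustrated with a short simulation. This is a minimal sketch, not the authors' implementation: it draws counts from a single multivariate Poisson lognormal component (a latent Gaussian vector whose exponential sets the Poisson rates) and then blanks out entries to mimic partial records. The function names `sample_mpln` and `mask_mcar`, the parameter values, and the missing-completely-at-random mechanism are all assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np

def sample_mpln(n, mu, sigma, rng):
    """Draw n count vectors from one Poisson lognormal component.

    Latent layer: theta ~ N(mu, sigma).
    Observed layer: Y_j | theta_j ~ Poisson(exp(theta_j)), independently over j.
    """
    theta = rng.multivariate_normal(mu, sigma, size=n)
    return rng.poisson(np.exp(theta))

def mask_mcar(y, frac, rng):
    """Return a float copy of y with about `frac` of entries set to NaN.

    Entries are removed completely at random (an assumed mechanism).
    """
    y = y.astype(float)
    y[rng.random(y.shape) < frac] = np.nan
    return y

rng = np.random.default_rng(0)
mu = np.array([1.0, 2.0])                    # hypothetical latent means
sigma = np.array([[0.5, 0.3], [0.3, 0.5]])   # hypothetical latent covariance

counts = sample_mpln(5000, mu, sigma, rng)
partial = mask_mcar(counts, 0.2, rng)

# Over-dispersion: each marginal variance exceeds its mean.
print("mean:", counts.mean(axis=0))
print("variance:", counts.var(axis=0))
# Correlation between variables is induced by the latent covariance.
print("correlation:", np.corrcoef(counts.T)[0, 1])
print("fraction missing:", np.isnan(partial).mean())
```

A pure Poisson model would force the variance to equal the mean and the variables to be independent; the latent Gaussian layer relaxes both constraints, which is the property the abstract relies on.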

Keywords

Clustering

Missing Data

Discrete Data

Multivariate Poisson Lognormal Distribution

Main Sponsor

Biometrics Section