Revisiting the Spherical-Dirichlet Distribution: Corrections and Applications in Data Mining

Jacob Harris Co-Author
Texas A&M University-Corpus Christi
 
Jose Guardiola First Author
Texas A&M University-Corpus Christi
 
Jose Guardiola Presenting Author
Texas A&M University-Corpus Christi
 
Thursday, Aug 7: 9:50 AM - 10:05 AM
2022 
Contributed Papers 
Music City Center 
Data mining and gene expression analysis are at the forefront of modern data analysis. In this paper, we present a revised and corrected version of the spherical-Dirichlet distribution, originally introduced by one of the present authors. The updated formulation addresses key issues in the original development while preserving the core structure and motivation of the distribution. The spherical-Dirichlet distribution is designed to model vectors constrained to the positive orthant of the hypersphere, so it places no probability mass on regions of the sphere where such data cannot occur. We analyze the distribution's fundamental properties, including updated normalizing constants and moments, and further explore its relationships with other distributions. Estimators based on classical inferential statistics, such as the method of moments and maximum likelihood estimation, are derived. To illustrate the impact of these corrections, we apply the revised distribution to two examples, one with simulated data and one with a real text mining dataset, mirroring the approach in the original work. The results highlight the improvements and practical implications of the proposed modifications.

Keywords

Dirichlet Distribution

Probability Distributions

Hypersphere

Positive Quadrant

Data Mining

Spherical Dirichlet 

Main Sponsor

Section on Statistical Computing