Advances in Statistical Distributions

Mostafa Zahed Chair
East Tennessee State University
 
Thursday, Aug 7: 8:30 AM - 10:20 AM
4204 
Contributed Papers 
Music City Center 
Room: CC-209B 

Main Sponsor

Section on Statistical Computing

Presentations

A hybrid high-dimensional matrix-free approach for Mixture of t-factor analyzers

Traditional mixture of factor analyzers (MFA) models, which rely on Gaussian assumptions, are sensitive to outliers and heavy-tailed distributions, making them less robust in complex real-world scenarios. The Mixture of t-Factor Analyzers (MtFA) model extends this framework by incorporating multivariate t-distributions, offering improved robustness to non-Gaussian data. Despite its advantages, the MtFA model faces computational challenges, particularly in high-dimensional settings, where the estimation of large covariance matrices and the iterative nature of Expectation-Maximization (EM) algorithms lead to scalability issues. In this work, we present a hybrid approach that integrates a matrix-free algorithm into the EM framework to efficiently estimate the parameters of the MtFA model. By leveraging the structure of the t-distribution within a factor analysis framework, our method retains the interpretability of traditional MFA while improving robustness to heavy-tailed noise and localized anomalies. We demonstrate the effectiveness of our approach through simulations and real-world datasets, showcasing its superior computational efficiency and resilience against outliers while preserving clustering accuracy.
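As a rough illustration of why the factor-analytic structure helps in high dimensions, the sketch below evaluates squared Mahalanobis distances under a covariance of the form Sigma = Lambda Lambda^T + diag(psi) without ever forming the p x p matrix, using the standard Woodbury identity. This is not the authors' matrix-free algorithm, which the abstract does not detail; the function names and arguments are hypothetical. The weight formula (nu + p) / (nu + d^2) is the usual E-step weight for a multivariate t component.

```python
import numpy as np

def mahalanobis_low_rank(X, mu, Lambda, psi):
    """Squared Mahalanobis distances under Sigma = Lambda Lambda^T + diag(psi).

    Uses the Woodbury identity, so only a q x q system is solved and the
    p x p covariance matrix is never formed (illustrative sketch only).
    """
    R = X - mu                                   # (n, p) residuals
    Rs = R / psi                                 # rows hold Psi^{-1} r
    q = Lambda.shape[1]
    M = np.eye(q) + Lambda.T @ (Lambda / psi[:, None])   # I_q + Lambda^T Psi^{-1} Lambda
    B = Rs @ Lambda                              # (n, q): rows hold Lambda^T Psi^{-1} r
    sol = np.linalg.solve(M, B.T).T              # M^{-1} applied to each row of B
    return np.sum(R * Rs, axis=1) - np.sum(B * sol, axis=1)

def t_weights(d2, nu, p):
    """E-step weights for a multivariate t component with nu degrees of freedom."""
    return (nu + p) / (nu + d2)
```

Observations with large distances receive small weights, which is the mechanism that downweights outliers in t-based mixtures.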

Keywords

Mixture of factor analyzers

data clustering

matrix-free computations

expectation-maximization algorithm

dimensionality reduction

factor analysis 

Co-Author

Fan Dai, Michigan Technological University

First Author

Kazeem Kareem, Michigan Technological University

Presenting Author

Kazeem Kareem, Michigan Technological University

Extending the gain-probability analysis to the family of gamma distributions

Due to its flexibility in handling skewness, the family of gamma distributions is applicable to numerous domains where less flexible distributions prove inadequate. This paper extends gain-probability (G-P) analysis to the family of gamma distributions, providing a comprehensive investigation of its applicability in statistical modeling. G-P analyses are developed for both independent and dependent (matched) data scenarios. Monte Carlo studies demonstrate the stability and robustness of maximum likelihood estimators of parameters in gamma distributions within the G-P framework. Furthermore, applications to real-world streamflow data highlight the comparative advantages of G-P analysis using the gamma distribution family. To facilitate practical implementation, free online calculators are provided for computing gain probabilities under the proposed methodology. 
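A minimal Monte Carlo sketch of a gain probability for two independent gamma samples follows. It assumes the gain probability is taken as P(X - Y > d) for a chosen threshold d, which may differ from the exact definition used in the paper; the function name, parameterization (shape and scale), and defaults are hypothetical and unrelated to the authors' online calculators.

```python
import numpy as np

def gain_probability_gamma(shape_x, scale_x, shape_y, scale_y, d=0.0,
                           n_draws=200_000, seed=0):
    """Monte Carlo estimate of P(X - Y > d) for independent gamma X and Y.

    Assumption: X and Y are independent; matched (dependent) data would
    need a joint model and are not covered by this sketch.
    """
    rng = np.random.default_rng(seed)
    x = rng.gamma(shape_x, scale_x, size=n_draws)
    y = rng.gamma(shape_y, scale_y, size=n_draws)
    return float(np.mean(x - y > d))
```

With both shapes equal to 1 the variables are exponential, and for d = 0 the exact value is scale_x / (scale_x + scale_y), which gives a quick sanity check on the simulation.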

Keywords

gamma distribution

gain-probability analysis

statistical modeling

maximum likelihood estimator

Monte Carlo studies

streamflow data 

Co-Author(s)

Xiangfei Chen, Bridgewater State University
David Trafimow, New Mexico State University
Tonghui Wang, New Mexico State University
Boris Choy, The University of Sydney

First Author

Ziyuan Wang, University of Wisconsin Oshkosh

Presenting Author

Ziyuan Wang, University of Wisconsin Oshkosh

Generalized Skew Flexible Normal Distributions with Applications

Based on previous research featuring generalized distributions, we propose an extension of both the generalized skew normal distributions introduced by Kumar and Anusree (2011) and the skew flexible normal distributions proposed by Gómez et al. (2011). The properties of this family of distributions are explored, and the parameters are estimated using the maximum likelihood method. Two simulation studies are conducted, along with two real data examples, to demonstrate the primary findings.

Keywords

flexibility

bimodal

skew normal

asymmetric 

First Author

Tingting Tong, College of Charleston

Presenting Author

Tingting Tong, College of Charleston

Low-rank regularization of Fréchet regression models for distribution function response

Fréchet regression has emerged as a promising approach for modeling non-Euclidean response variables associated with Euclidean covariates. In this talk, we propose an estimation method with low-rank regularization for global Fréchet regression models. Specifically focusing on distribution function responses, we demonstrate how this framework employs low-rank regularization to enhance the efficiency and accuracy of the model fit. The proposed method enables more robust modeling and estimation, particularly in high-dimensional settings. We present a detailed theoretical analysis of the large-sample properties of the proposed estimator. Numerical experiments further validate these theoretical results. 
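For orientation, the sketch below shows the unregularized global Fréchet regression fit for distribution responses under the 2-Wasserstein metric: a weighted average of the observed quantile functions, projected onto nondecreasing functions. It does not include the low-rank regularization proposed in the talk. The function names are hypothetical, and the quantile functions are assumed to be evaluated on a common, equally spaced probability grid.

```python
import numpy as np

def pava(y):
    """L2 projection onto nondecreasing sequences (pool adjacent violators)."""
    blocks = []                                  # each block is [sum, count]
    for v in y:
        blocks.append([float(v), 1])
        # Merge while the previous block mean exceeds the current block mean.
        while (len(blocks) > 1 and
               blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    return np.concatenate([np.full(c, s / c) for s, c in blocks])

def global_frechet_quantile(X, Q, x_new):
    """Fitted quantile function at x_new.

    X: (n, p) covariates; Q: (n, m) quantile functions on a common grid.
    Weights are 1 + (X_i - mean)^T Sigma^{-1} (x_new - mean).
    """
    n = X.shape[0]
    xbar = X.mean(axis=0)
    Xc = X - xbar
    Sigma = Xc.T @ Xc / n                        # empirical covariance
    w = 1.0 + Xc @ np.linalg.solve(Sigma, x_new - xbar)
    return pava(w @ Q / n)                       # weighted mean, then monotone projection
```

The weights can be negative far from the covariate mean, which is why the projection step is needed to guarantee a valid quantile function.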

Keywords

Fréchet regression

Low-rank regularization

Distribution function responses

Quantile function responses

Wasserstein space

Optimal transport 

Co-Author

Hsin-Hsiung Huang, University of Central Florida

First Author

Kyunghee Han, University of Illinois at Chicago

Presenting Author

Kyunghee Han, University of Illinois at Chicago

On Generalized Inverse Pareto Family of Distributions: Properties and Applications

This study proposes new families of generalized inverse Pareto distributions using the T-R{Y} framework. Several choices for the distributions of the random variables T and Y lead to generalized families of the random variable R, which, in this study, is characterized by the inverse Pareto distribution. The generalized family of distributions is thus named the T-inverse Pareto{Y} family. We consider the exponential, Weibull, log-logistic, logistic, Cauchy, and extreme value distributions as potential choices for the distribution of the random variable Y. Specific members of the T-inverse Pareto{Y} family exhibit symmetric, right-skewed, left-skewed, unimodal, or bimodal density functions. Some statistical properties of the T-inverse Pareto{Y} family are investigated. The method of maximum likelihood is proposed for estimating the distribution parameters, and its performance is assessed using a simulation study. Four real-world datasets from different disciplines are analyzed to demonstrate the flexibility of the proposed T-inverse Pareto{Y} family of distributions.
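
As a brief worked illustration, the T-R{Y} construction composes the CDF of T with the quantile function of Y and the CDF of R. The inverse Pareto parameterization below (shape \tau, scale \theta) is a common one and is assumed here; the abstract does not state which form the authors use. The exponential choice for Y is one of the options listed above, and T is then assumed to have support on the positive half-line.

```latex
% T-R{Y} construction: F_T, F_R, F_Y are CDFs; Q_Y is the quantile function of Y.
G(x) = F_T\bigl(Q_Y(F_R(x))\bigr),
\qquad
g(x) = \frac{f_R(x)}{f_Y\bigl(Q_Y(F_R(x))\bigr)}\, f_T\bigl(Q_Y(F_R(x))\bigr).

% Assumed inverse Pareto baseline with shape \tau > 0 and scale \theta > 0:
F_R(x) = \left(\frac{x}{x+\theta}\right)^{\tau}, \qquad x > 0.

% Example: Y standard exponential, so Q_Y(p) = -\log(1-p):
G(x) = F_T\!\left(-\log\!\left[1 - \left(\frac{x}{x+\theta}\right)^{\tau}\right]\right).
```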

Keywords

T-R{Y} framework

Inverse Pareto distribution

Quantile function

Maximum likelihood estimation

Censoring 

Co-Author

Felix Famoye, Central Michigan University

First Author

Nirajan Budhathoki

Presenting Author

Nirajan Budhathoki

Revisiting the Spherical-Dirichlet Distribution: Corrections and Applications in Data Mining

Today, data mining and gene expression analysis are at the forefront of modern data analysis. In this paper, we present a revised and corrected version of the spherical-Dirichlet distribution, originally introduced by the same author. This updated formulation addresses key issues in the original development while maintaining the core structure and motivation behind the distribution. The spherical-Dirichlet distribution is designed to model vectors constrained to the positive orthant of the hypersphere, thereby eliminating unnecessary probability mass. We provide a thorough analysis of the distribution's fundamental properties, including updated normalizing constants and moments. Relationships with other distributions are further explored. Estimators based on classical inferential statistics, such as the method of moments and maximum likelihood estimation, are derived. To illustrate the impact of these corrections, we apply the revised distribution to two examples: one with simulated data and another using a real text mining dataset, mirroring the approach in the original work. The results highlight the improvements and practical implications of the proposed modifications.
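To make the geometry concrete, the sketch below draws vectors on the positive orthant of the unit hypersphere by taking elementwise square roots of Dirichlet draws. This square-root construction is an assumption made for illustration; the abstract does not specify the transformation, and the corrected density, normalizing constants, and estimators from the paper are not reproduced here. The function name is hypothetical.

```python
import numpy as np

def sample_spherical_dirichlet(alpha, size, seed=0):
    """Draw points on the positive orthant of the unit hypersphere.

    Assumed construction: if y ~ Dirichlet(alpha) lies on the simplex,
    then x = sqrt(y) has nonnegative entries and unit Euclidean norm.
    """
    rng = np.random.default_rng(seed)
    y = rng.dirichlet(alpha, size=size)          # rows sum to 1
    return np.sqrt(y)                            # rows have squared norm 1
```

Text mining vectors such as normalized term frequencies have nonnegative entries, which is why restricting the support to the positive orthant avoids placing mass where no data can occur.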

Keywords

Dirichlet Distribution

Probability Distributions

Hypersphere

Positive Quadrant

Data Mining

Spherical Dirichlet 

Co-Author

Jacob Harris, Texas A&M University-Corpus Christi

First Author

Jose Guardiola, Texas A&M University-Corpus Christi

Presenting Author

Jose Guardiola, Texas A&M University-Corpus Christi