Statistical Computing for Innovative Distributions and Applications

Jiahe Li, Chair
 
Thursday, Aug 8: 8:30 AM - 10:20 AM
5183 
Contributed Papers 
Oregon Convention Center 
Room: CC-C121 

Main Sponsor

Section on Statistical Computing

Presentations

Modeling single-cell multiplex immunofluorescence imaging data with parsimonious finite mixtures of Tukey g-and-h distributions

A mixture of 4-parameter Tukey's g-&-h distributions is proposed for fitting finite mixtures with Gaussian and non-Gaussian components. Since the likelihood of the Tukey's g-&-h mixtures does not have a closed analytical form, we propose a Quantile Least Mahalanobis Distance (QLMD) estimator for the parameters of such mixtures. QLMD is an indirect estimator that minimizes the Mahalanobis distance between the sample and model-based quantiles, and its asymptotic properties follow from the general theory of indirect estimation. We have developed a stepwise algorithm to select a parsimonious Tukey's g-&-h mixture model and implemented all proposed methods in the R package QuantileGH, available on CRAN. A simulation study was conducted to evaluate the performance of the Tukey's g-&-h mixtures and compare it with that of mixtures of skew-normal or skew-t distributions. The Tukey's g-&-h mixtures were applied to model cellular expressions of Cyclin D1 protein in breast cancer tissues, and the resulting parameter estimates were evaluated as predictors of progression-free survival.
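As a minimal sketch of the distribution behind these mixtures, the quantile function of a single Tukey g-&-h component can be written as a transformation of the standard normal quantile. The function and argument names below (A for location, B for scale, g for skewness, h for tail heaviness) are illustrative assumptions and are not the interface of the QuantileGH package.

```python
import math
from statistics import NormalDist


def tukey_gh_quantile(p, A=0.0, B=1.0, g=0.0, h=0.0):
    """Quantile of a Tukey g-&-h variable at probability p (0 < p < 1).

    The variable is A + B * T(Z), with Z standard normal and
    T(z) = (exp(g*z) - 1) / g * exp(h * z**2 / 2); the g -> 0 limit is z * exp(h * z**2 / 2).
    """
    z = NormalDist().inv_cdf(p)
    # Skewness part; expm1 keeps precision when g*z is small.
    skew = z if g == 0 else math.expm1(g * z) / g
    # Tail part; h > 0 stretches both tails.
    return A + B * skew * math.exp(h * z * z / 2.0)


if __name__ == "__main__":
    for p in (0.1, 0.5, 0.9):
        print(p, tukey_gh_quantile(p, A=1.0, B=2.0, g=0.5, h=0.1))
```

Because the transformation is monotone in z when h is non-negative, model-based quantiles of one component are available in closed form even though the density is not, which is what makes quantile-matching estimators attractive here.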

Keywords

Finite Mixtures

Tukey’s g-&-h distribution

Indirect Estimator

Quantile Least Mahalanobis Distance

Cellular Protein Level 

Co-Author(s)

Misung Yi
Inna Chervoneva, Thomas Jefferson University, Sidney Kimmel Medical College

First Author

Tingting Zhan, Thomas Jefferson University

Presenting Author

Tingting Zhan, Thomas Jefferson University

The Quarter Von Mises Distribution

The world today increasingly relies on data science and statistics for analyzing diverse directional data, such as text data, health studies, image processing, wireless sensor networks, environmental monitoring, robotics, geology and materials science. In numerous instances, data exhibit positive orientation and necessitate probability distributions that are confined to positive regions, such as the positive quarter of the unit circle. This article addresses this need by introducing a novel transformation of the von Mises distribution specifically designed for the positive quarter of the unit circle, filling a current gap in available distributions. The newly proposed distribution, referred to as the Quarter von Mises Distribution, is thoroughly investigated in this work: we characterize the distribution through its moments and develop its main properties. Additionally, methods for estimating the distribution parameters using maximum likelihood estimation are presented, along with a hypothesis testing approach using the likelihood ratio test. Furthermore, practical data applications are demonstrated to showcase the effectiveness of these methods. Overall, this work

Keywords

Von Mises distribution


circular statistics

directional statistics


positive quarter


unit circle 

View Abstract 2275

Co-Author(s)

Mason Myers, Texas A&M University-Corpus Christi
Hassan Elsalloukh, University of Arkansas at Little Rock

First Author

Jose Guardiola, Texas A&M University-Corpus Christi

Presenting Author

Jose Guardiola, Texas A&M University-Corpus Christi

Circular (Directional) regression

Directional data has received increasing attention across a large number of scientific fields. In particular, such data assume some notion of an underlying circular distribution, which is characterized by some form of angular or degree direction. Naturally, modeling with such distributions when observed covariates are present necessitates the use of regression methods. However, circular variables have specific characteristics that differ from those of linear variables, so traditional linear models need an appropriate transformation to become circular models. This paper extends the simple circular-circular regression model and the circular-linear model into multivariate circular-circular regression models, and models based on both circular and linear covariates. We further develop a degree-determination algorithm that is used in the aforementioned models. This algorithm makes use of classic dimension reduction methods (principal component analysis and partial least squares) applied to multivariate circular regression models. The performance of our methods is investigated and compared using both simulated and real datasets.
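To illustrate why circular variables cannot be treated like linear ones, the short sketch below compares the arithmetic mean of two angles with their circular mean, computed from the averaged sine and cosine components. The function name is an illustrative assumption and the example is not part of the regression models described in the abstract.

```python
import math


def circular_mean_deg(angles_deg):
    """Circular mean of angles in degrees, returned in [0, 360)."""
    s = sum(math.sin(math.radians(a)) for a in angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg)
    # atan2 recovers the direction of the resultant vector.
    return math.degrees(math.atan2(s, c)) % 360.0


if __name__ == "__main__":
    angles = [350.0, 20.0]
    print("arithmetic mean:", sum(angles) / len(angles))  # points the opposite way
    print("circular mean:  ", circular_mean_deg(angles))  # close to 5 degrees
```

The arithmetic mean of 350 and 20 degrees is 185 degrees, which points away from both observations, whereas the circular mean lies between them. Regression models for circular responses build on the same sine and cosine representation.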

Keywords

Circular data

Regression model

Dimension reduction

Determination of degree of polynomial 

View Abstract 2385

Co-Author

Derek Young, University of Kentucky

First Author

Pengyuan Chen, University of Kentucky

Presenting Author

Pengyuan Chen, University of Kentucky

Efficient computation of pattern statistics for many different input probabilities

A Markov chain-based approach provides an efficient way to compute a single distribution of a pattern statistic in a Markovian sequence. However, if distributions are needed for many values of the input probabilities, the entire computation must be repeated. The method put forward in this work avoids the need to redo recursive updates of probabilities. Instead, counts of data strings with various values of sufficient statistics are updated recursively. The final counts are then used to reconstruct probabilities for the many input probabilities, improving efficiency. In this talk, the methodology is laid out systematically.
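The idea of updating counts instead of probabilities can be sketched in a deliberately simplified setting. The example below assumes an i.i.d. binary sequence (not the Markovian case treated in the talk), takes the number of ones as the sufficient statistic, and uses "contains a run of at least r ones" as the pattern event. All function names are hypothetical.

```python
from collections import defaultdict


def count_strings_with_run(n, r):
    """counts[k] = number of binary strings of length n with k ones
    that contain a run of at least r consecutive ones."""
    # State: (current trailing run of ones, capped at r; number of ones so far).
    # A run value of r is absorbing: the pattern has already occurred.
    state = {(0, 0): 1}
    for _ in range(n):
        nxt = defaultdict(int)
        for (run, ones), c in state.items():
            nxt[(run if run == r else 0, ones)] += c      # append a 0
            nxt[(min(run + 1, r), ones + 1)] += c         # append a 1
        state = nxt
    counts = [0] * (n + 1)
    for (run, ones), c in state.items():
        if run == r:
            counts[ones] += c
    return counts


def run_probability(counts, p):
    """Reconstruct P(pattern occurs) for success probability p from the counts."""
    n = len(counts) - 1
    return sum(c * p**k * (1 - p) ** (n - k) for k, c in enumerate(counts))


if __name__ == "__main__":
    counts = count_strings_with_run(20, 4)   # the recursion runs once
    for p in (0.1, 0.3, 0.5, 0.7):           # cheap evaluation for each p
        print(p, run_probability(counts, p))
```

The recursion is parameter-free, so it is performed once; each additional input probability then costs only a single weighted sum over the counts.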

Keywords

Markovian data

Parameter-free computation

Recursive computation

View Abstract 3281

Co-Author(s)

Nonhle Mdziniso, Rochester Institute of Technology
Elie Alhajjar, RAND Corporation
Laurent Noe, CRIStAL (UMR 9189 Lille University/CNRS) - INRIA Lille Nord-Europe

First Author

Donald Martin, NC State University

Presenting Author

Donald Martin, NC State University

Inference on Two-Parameter Maxwell Distribution: Two-Sample Case

The two-parameter Maxwell distribution is commonly used in life-testing and reliability analysis due to its smoothly increasing failure rate. Our study focuses on constructing confidence intervals (CIs) for the difference between the means and the ratio of the means of two independent Maxwell distributions. We propose CIs based on the fiducial approach, the approximate fiducial approach (also known as the modified normal-based approximation), and the parametric bootstrap (PB) method. We compare these methods based on their coverage probability and precision. We extend these methods to find CIs for the difference between, and the ratio of, percentiles of two independent Maxwell distributions. Specifically, we develop CIs for the ratio of the 5th percentiles and the ratio of the medians and evaluate them in terms of coverage probability and precision. We illustrate these methods using two examples with real-life data. Although the PB confidence intervals are more efficient than the fiducial CIs in some situations, the approximate fiducial CIs are very simple to compute and are comparable with the PB CIs in most cases.
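A parametric bootstrap interval for the ratio of two Maxwell means can be sketched as follows. This is a simplified illustration, not the procedure evaluated in the abstract: it assumes the location parameter is known to be zero (so only the scale is estimated, with closed-form MLE), and it uses a plain percentile interval. All function names are hypothetical.

```python
import math
import random


def maxwell_sample(sigma, n, rng):
    """Draw n Maxwell(scale=sigma) variates as the norm of a 3-D Gaussian vector."""
    return [sigma * math.sqrt(sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(3)))
            for _ in range(n)]


def scale_mle(x):
    """MLE of the Maxwell scale when the location is zero: sqrt(sum(x^2) / (3n))."""
    return math.sqrt(sum(v * v for v in x) / (3 * len(x)))


def mean_from_scale(sigma):
    """Mean of a Maxwell distribution with location zero and scale sigma."""
    return 2.0 * sigma * math.sqrt(2.0 / math.pi)


def pb_ci_ratio_of_means(x, y, B=2000, level=0.95, seed=0):
    """Percentile parametric-bootstrap CI for mean(X) / mean(Y)."""
    rng = random.Random(seed)
    sx, sy = scale_mle(x), scale_mle(y)
    ratios = []
    for _ in range(B):
        # Resample from the fitted models and re-estimate the ratio.
        bx = scale_mle(maxwell_sample(sx, len(x), rng))
        by = scale_mle(maxwell_sample(sy, len(y), rng))
        ratios.append(mean_from_scale(bx) / mean_from_scale(by))
    ratios.sort()
    alpha = 1.0 - level
    lo = ratios[math.floor(alpha / 2 * (B - 1))]
    hi = ratios[math.ceil((1 - alpha / 2) * (B - 1))]
    return lo, hi


if __name__ == "__main__":
    rng = random.Random(1)
    x = maxwell_sample(2.0, 30, rng)
    y = maxwell_sample(1.0, 30, rng)
    print(pb_ci_ratio_of_means(x, y))
```

With a nonzero location parameter, the scale MLE no longer has this closed form and both parameters would need to be estimated numerically in each bootstrap replicate.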

Keywords

coverage probability

equivariant estimator

fiducial method

location-scale family

MLEs

PB approach 

View Abstract 2350

Co-Author

Kalimuthu Krishnamoorthy

First Author

Faysal Ahmed Chowdhury, Florida Gulf Coast University

Presenting Author

Faysal Ahmed Chowdhury, Florida Gulf Coast University

On Generalized Inverse Pareto Family of Distributions: Properties and Applications

This study proposes new families of generalized inverse Pareto distributions using the T-R{Y} framework. Different choices for the distributions of the random variables T and Y lead to generalized families of the random variable R, which, in this study, follows the inverse Pareto distribution. The generalized family of distributions is thus named the T-inverse Pareto{Y} family. We consider the exponential, Weibull, log-logistic, logistic, Cauchy, and extreme value distributions as potential choices for the distribution of the random variable Y. Specific members of the T-inverse Pareto{Y} family exhibit symmetric, right-skewed, left-skewed, unimodal, or bimodal density functions. Some statistical properties of the T-inverse Pareto{Y} family are investigated. The method of maximum likelihood is proposed for estimating the distribution parameters, and its performance is assessed using a simulation study. Four real datasets from different disciplines are analyzed to demonstrate the flexibility of the proposed T-inverse Pareto{Y} family of distributions.
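In the T-R{Y} framework, the CDF of the generalized variable is obtained by composing three pieces: the CDF of R, the quantile function of Y, and the CDF of T. The sketch below builds one member as an illustration, taking T to be Weibull and Y to be standard exponential. The inverse Pareto parameterization F_R(x) = (x / (x + theta))^tau and all function names are assumptions made for this example, not choices confirmed by the abstract.

```python
import math


def inverse_pareto_cdf(x, theta, tau):
    """CDF of R under one common parameterization: (x / (x + theta)) ** tau for x > 0."""
    return (x / (x + theta)) ** tau if x > 0 else 0.0


def exponential_quantile(p):
    """Quantile function of the standard exponential distribution (choice of Y)."""
    return -math.log1p(-p)


def weibull_cdf(t, shape, scale):
    """CDF of a Weibull distribution (choice of T), supported on t > 0."""
    return -math.expm1(-((t / scale) ** shape)) if t > 0 else 0.0


def t_inverse_pareto_y_cdf(x, theta, tau, shape, scale):
    """T-R{Y} composition: F_X(x) = F_T(Q_Y(F_R(x)))."""
    return weibull_cdf(exponential_quantile(inverse_pareto_cdf(x, theta, tau)),
                       shape, scale)


if __name__ == "__main__":
    for x in (0.5, 1.0, 2.0, 5.0):
        print(x, t_inverse_pareto_y_cdf(x, theta=1.0, tau=2.0, shape=1.5, scale=1.0))
```

When T and Y are both standard exponential (shape and scale equal to 1 here), the composition collapses to the inverse Pareto CDF itself, which gives a quick sanity check on the construction.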

Keywords

T-R{Y} framework

Inverse Pareto distribution

Quantile function

Maximum likelihood estimation

Censoring 

View Abstract 2840

Co-Author

Felix Famoye, Central Michigan University

First Author

Nirajan Budhathoki

Presenting Author

Nirajan Budhathoki

Minimum Covariance Determinant: Spectral Embedding and Subset Size Determination

This paper introduces several ideas to the minimum covariance determinant problem for outlier detection and robust estimation of means and covariances. We leverage the principal component transform to achieve dimension reduction, paving the way for improved analyses. Our best subset selection algorithm strategically combines statistical depth and concentration steps. To ascertain the appropriate subset size and number of principal components, we introduce a novel bootstrap procedure that estimates the instability of the best subset algorithm. The parameter combination exhibiting minimal instability proves ideal for the purposes of outlier detection and robust estimation. Rigorous benchmarking against prominent MCD variants showcases our approach's superior capability in outlier detection and computational speed in high dimensions. Application to a fruit spectra data set and a cancer genomics data set illustrates our claims. 
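The concentration steps mentioned above can be illustrated with the classical C-step: estimate the mean and covariance from the current subset, then keep the h observations with the smallest Mahalanobis distances. The sketch below shows only this generic step, assuming NumPy is available. It omits the principal component transform, statistical depth initialization, and bootstrap instability procedure described in the abstract, and the function names are hypothetical.

```python
import numpy as np


def c_step(X, subset, h):
    """One concentration step: refit on `subset`, keep the h closest points."""
    mu = X[subset].mean(axis=0)
    S = np.cov(X[subset], rowvar=False)
    diff = X - mu
    # Squared Mahalanobis distance of every observation to the subset fit.
    d2 = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(S), diff)
    return np.argsort(d2)[:h]


def concentrate(X, subset, h, max_iter=100):
    """Iterate C-steps until the subset stops changing; track the determinants."""
    dets = [np.linalg.det(np.cov(X[subset], rowvar=False))]
    for _ in range(max_iter):
        new = c_step(X, subset, h)
        if np.array_equal(np.sort(new), np.sort(subset)):
            break
        subset = new
        dets.append(np.linalg.det(np.cov(X[subset], rowvar=False)))
    return subset, dets


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    outliers = rng.normal(10.0, 0.5, size=(5, 2))
    inliers = rng.normal(0.0, 1.0, size=(50, 2))
    X = np.vstack([outliers, inliers])
    # Deliberately start from a subset that contains the outliers (rows 0-4).
    subset, dets = concentrate(X, np.arange(40), h=40)
    print("determinants:", dets)
    print("outlier rows still in subset:", sorted(set(subset.tolist()) & set(range(5))))
```

Each C-step cannot increase the covariance determinant of the retained subset, so the iteration converges to a local solution; the quality of that solution depends on the starting subset, which is where initialization strategies such as statistical depth come in.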

Keywords

Robustness

Outliers

Principal component analysis

Statistical depth

Bootstrap

Algorithm instability 

View Abstract 1954

Co-Author(s)

Yichi Zhang, North Carolina State University
Kenneth Lange, Department of Computational Medicine, UCLA

First Author

Qiang Heng

Presenting Author

Qiang Heng