Comparing Two Categorical Gini Correlations with Applications to Classification Problems

Sameera Hewage (Speaker)
University of Louisiana at Lafayette

Yongli Sang (Co-Author)
University of Louisiana at Lafayette
 
Thursday, Aug 6: 10:30 AM - 12:20 PM
3739 
Contributed Papers 
We introduce a general inferential framework for comparing predictor importance in classification models with categorical responses. Our approach is based on the categorical Gini correlation (CGC), a measure of dependence between a numerical variable and a categorical variable that quantifies how important a predictor is for the response. To compare the importance of two predictors with respect to the same categorical outcome, we conduct hypothesis tests on their CGCs. The framework accommodates predictors of arbitrary and unequal dimensions. We derive the asymptotic distribution of the test statistic and show that the test is consistent. In addition, we propose a nonparametric bootstrap procedure as an alternative to the test based on asymptotic normality. Simulation studies demonstrate the empirical performance of the proposed tests, and applications to two real datasets illustrate their practical utility.
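
The comparison described above can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes the commonly used form of the sample CGC, (Δ − Σ_k p_k Δ_k) / Δ, where Δ is the mean pairwise Euclidean distance among all observations of a predictor and Δ_k is the same quantity within class k, and it swaps in a plain percentile bootstrap interval for the difference of two CGCs. The function names `gini_mean_difference`, `cgc`, and `bootstrap_cgc_difference` are hypothetical, as are the defaults for `n_boot`, `level`, and `seed`.

```python
import numpy as np


def gini_mean_difference(x):
    """Mean Euclidean distance over all distinct pairs of rows of x."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]  # treat a 1-D predictor as n rows of dimension 1
    n = x.shape[0]
    if n < 2:
        return 0.0  # assumption: a class with one observation contributes 0
    dists = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=2)
    return dists.sum() / (n * (n - 1))


def cgc(x, y):
    """Sample categorical Gini correlation under the assumed definition."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y)
    overall = gini_mean_difference(x)
    if overall == 0.0:
        return 0.0  # constant predictor: no variation to explain
    within = sum(
        np.mean(y == k) * gini_mean_difference(x[y == k]) for k in np.unique(y)
    )
    return (overall - within) / overall


def bootstrap_cgc_difference(x1, x2, y, n_boot=500, level=0.95, seed=0):
    """Percentile bootstrap interval for cgc(x1, y) - cgc(x2, y).

    Rows are resampled jointly so that x1, x2, and y stay paired.
    """
    rng = np.random.default_rng(seed)
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    y = np.asarray(y)
    n = len(y)
    observed = cgc(x1, y) - cgc(x2, y)
    diffs = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)
        diffs[b] = cgc(x1[idx], y[idx]) - cgc(x2[idx], y[idx])
    alpha = 1.0 - level
    lo, hi = np.quantile(diffs, [alpha / 2, 1.0 - alpha / 2])
    return observed, (lo, hi)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    y = np.repeat([0, 1], 40)
    x1 = 3.0 * y + rng.normal(size=80)   # informative, 1-dimensional
    x2 = rng.normal(size=(80, 3))        # uninformative, 3-dimensional
    est, (lo, hi) = bootstrap_cgc_difference(x1, x2, y)
    print(f"difference = {est:.3f}, interval = ({lo:.3f}, {hi:.3f})")
```

In this sketch the two predictors may have different dimensions, since each CGC is computed from pairwise distances within its own predictor. An interval for the difference that excludes zero would suggest that the two predictors differ in importance for the same categorical response.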

Keywords

Categorical Gini correlation

Comparing correlations

Classification

Predictor importance

Categorical response

Nonparametric bootstrap

Main Sponsor

Section on Statistical Learning and Data Science