Comparing Two Categorical Gini Correlations with Applications to Classification Problems
Yongli Sang
Co-Author
University of Louisiana at Lafayette
Thursday, Aug 6: 10:30 AM - 12:20 PM
3739
Contributed Papers
We introduce a general inferential framework for comparing predictor importance in classification models with categorical responses. Our approach is based on the categorical Gini correlation (CGC), a measure of dependence between a numerical variable and a categorical variable that quantifies how important a predictor is for the response. To compare the importance of two predictors with respect to the same categorical outcome, we conduct hypothesis tests on their CGCs. The framework accommodates predictors of arbitrary and unequal dimensions. We derive the asymptotic distribution of the test statistic and show that the resulting test is consistent. In addition, we propose a nonparametric bootstrap procedure as an alternative to the test based on asymptotic normality. Simulation studies demonstrate the empirical performance of the proposed tests, and applications to two real datasets illustrate their practical utility.
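As a rough illustration of the comparison described in the abstract, the sketch below estimates the CGC of two predictors against the same categorical response and builds a percentile bootstrap interval for their difference. It assumes the commonly used form of the CGC, (Δ − Σ_k p_k Δ_k) / Δ, where Δ is the mean Euclidean distance between two independent copies of the predictor and Δ_k is the same quantity within class k. The function names (`mean_pairwise_distance`, `cgc`, `bootstrap_cgc_difference`) and the percentile interval are illustrative assumptions, not the authors' implementation or their exact bootstrap procedure.

```python
import numpy as np


def mean_pairwise_distance(x):
    """Average Euclidean distance over all ordered pairs of distinct rows of x."""
    n = x.shape[0]
    if n < 2:
        return 0.0  # assumption: a class with fewer than 2 points contributes 0
    diffs = x[:, None, :] - x[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    return dists.sum() / (n * (n - 1))


def cgc(x, y):
    """Sample categorical Gini correlation between numerical x and categorical y."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]  # treat a vector as n observations of dimension 1
    y = np.asarray(y)
    delta = mean_pairwise_distance(x)
    # Class-proportion-weighted average of the within-class mean distances.
    within = sum(
        (y == k).mean() * mean_pairwise_distance(x[y == k]) for k in np.unique(y)
    )
    return (delta - within) / delta


def bootstrap_cgc_difference(x1, x2, y, n_boot=500, alpha=0.05, seed=0):
    """Observed CGC(x1, y) - CGC(x2, y) and a percentile bootstrap interval for it."""
    rng = np.random.default_rng(seed)
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    y = np.asarray(y)
    n = len(y)
    observed = cgc(x1, y) - cgc(x2, y)
    diffs = np.empty(n_boot)
    for b in range(n_boot):
        # Resample whole observations so x1, x2 and y stay paired.
        idx = rng.integers(0, n, size=n)
        diffs[b] = cgc(x1[idx], y[idx]) - cgc(x2[idx], y[idx])
    lower, upper = np.quantile(diffs, [alpha / 2, 1 - alpha / 2])
    return observed, (lower, upper)
```

Under these assumptions, an interval that excludes zero suggests the two predictors differ in importance for the response. Because the distances are computed in each predictor's own space, `x1` and `x2` may have different numbers of columns.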
Categorical Gini correlation
Comparing correlations
Classification
Predictor importance
Categorical response
Nonparametric bootstrap
Main Sponsor
Section on Statistical Learning and Data Science