Thin But Not Forgotten: Deep Kernel Learning for Credit Risk Modeling with High-Dimensional Missingness
Longxiu Tian
Speaker
University of North Carolina Kenan-Flagler Business School
Monday, Aug 4: 11:25 AM - 11:50 AM
Invited Paper Session
Music City Center
Credit scores are integral to financial inclusion and credit access for consumers. Companies model credit risk and use credit scores to evaluate individual consumers' creditworthiness across diverse sectors including banking, lending, insurance, utilities, and rentals. Despite this widespread use, a substantial segment of the population, including minorities, young adults, recent immigrants, and those in lower-income neighborhoods, remains 'unscorable' or 'credit invisible' due to insufficient or non-existent credit history. We introduce a novel application of deep kernel learning within a Gaussian process regression framework to increase the number of scorable consumers and broaden credit access. This methodology is motivated by the need to statistically rationalize missingness in the credit history data collected by credit rating agencies, a critical issue that traditional credit scoring models fail to accommodate effectively. We apply our method to a comprehensive dataset encompassing 600,000 U.S. consumers and over 3,000 credit history report attributes. We undertake a counterfactual analysis of the welfare implications of earlier scoring for individuals conventionally deemed unscorable. Our findings challenge the prevailing assumption that the accuracy and precision of credit risk models change starkly when a consumer moves from unscorable to scorable. Our results reveal a more nuanced reality: the transition at this 'boundary of scorability' is smoother than commonly perceived, suggesting that current credit scoring practices might be overly conservative. Our findings offer the potential for earlier and more inclusive scoring while maintaining the fidelity of credit risk models, benefiting consumers currently unscorable by conventional methods, as well as firms seeking to serve these consumers.
credit scoring
missing data
Gaussian process
deep kernel learning