Randall Davis, Andrew W. Lo, Sudhanshu Mishra, Arash Nourian, Manish Singh, Nicholas Wu, Ruixun Zhang
Title: Explainable Machine Learning Models of Consumer Credit Risk
Journal: Social Science Research Network, volume 9 1, pages 9 - 39
Published: 2023-10-04
DOI: 10.2139/ssrn.4006840 (https://doi.org/10.2139/ssrn.4006840)
Abstract
In this work, the authors create machine learning (ML) models to forecast home equity credit risk for individuals using a real-world dataset and demonstrate methods to explain the output of these ML models to make them more accessible to the end user. They analyze the explainability for various stakeholders: loan companies, regulators, loan applicants, and data scientists, incorporating their different requirements with respect to explanations. For loan companies, they generate explanations for every model prediction of creditworthiness. For regulators, they perform a stress test for extreme scenarios. For loan applicants, they generate diverse counterfactuals to guide them with steps toward a favorable classification from the model. Finally, for data scientists, they generate simple rules that accurately explain 70%–72% of the dataset. Their study provides a synthesized ML explanation framework for all stakeholders and is intended to accelerate the adoption of ML techniques in domains that would benefit from explanations of their predictions.
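To make the counterfactual idea mentioned for loan applicants concrete, here is a minimal sketch under stated assumptions. It is not the authors' method: the logistic scoring model, the feature names (`utilization`, `delinquencies`, `years_of_history`), the weights, and the feasibility bounds are all hypothetical and chosen only for illustration. For a rejected applicant, it computes the smallest change to each feature on its own that would move the model's score past the approval threshold, and keeps only the changes that stay within the assumed bounds.

```python
import math

# Hypothetical toy model: feature names, weights, bias, and bounds are
# illustrative assumptions, not values taken from the paper.
WEIGHTS = {"utilization": -3.0, "delinquencies": -0.8, "years_of_history": 0.15}
BIAS = 1.0
BOUNDS = {
    "utilization": (0.0, 1.0),
    "delinquencies": (0.0, 10.0),
    "years_of_history": (0.0, 40.0),
}


def logit(applicant):
    """Linear score of the toy model before the sigmoid."""
    return BIAS + sum(WEIGHTS[name] * applicant[name] for name in WEIGHTS)


def approve_probability(applicant):
    """Probability of a favorable classification under the toy model."""
    return 1.0 / (1.0 + math.exp(-logit(applicant)))


def single_feature_counterfactuals(applicant, threshold=0.5, margin=1e-6):
    """Return {feature: new_value} for each feature whose change alone
    would lift the applicant just past the approval threshold."""
    target = math.log(threshold / (1.0 - threshold)) + margin
    gap = target - logit(applicant)
    if gap <= 0.0:
        return {}  # already classified favorably; nothing to change
    results = {}
    for name, weight in WEIGHTS.items():
        if weight == 0.0:
            continue  # this feature cannot move the score
        new_value = applicant[name] + gap / weight
        low, high = BOUNDS[name]
        if low <= new_value <= high:  # discard infeasible suggestions
            results[name] = new_value
    return results


applicant = {"utilization": 0.8, "delinquencies": 1.0, "years_of_history": 6.0}
for name, value in single_feature_counterfactuals(applicant).items():
    print(f"change {name} from {applicant[name]} to {value:.2f}")
```

Because the toy model is linear in its features, each single-feature change has a closed form. A nonlinear model would need a search procedure instead, and a realistic tool would also have to separate features an applicant can actually change from those they cannot.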