{"title":"CERTIFAI: A Common Framework to Provide Explanations and Analyse the Fairness and Robustness of Black-box Models","authors":"Shubham Sharma, Jette Henderson, Joydeep Ghosh","doi":"10.1145/3375627.3375812","DOIUrl":null,"url":null,"abstract":"Concerns within the machine learning community and external pressures from regulators over the vulnerabilities of machine learning algorithms have spurred on the fields of explainability, robustness, and fairness. Often, issues in explainability, robustness, and fairness are confined to their specific sub-fields and few tools exist for model developers to use to simultaneously build their modeling pipelines in a transparent, accountable, and fair way. This can lead to a bottleneck on the model developer's side as they must juggle multiple methods to evaluate their algorithms. In this paper, we present a single framework for analyzing the robustness, fairness, and explainability of a classifier. The framework, which is based on the generation of counterfactual explanations through a custom genetic algorithm, is flexible, model-agnostic, and does not require access to model internals. The framework allows the user to calculate robustness and fairness scores for individual models and generate explanations for individual predictions which provide a means for actionable recourse (changes to an input to help get a desired outcome). This is the first time that a unified tool has been developed to address three key issues pertaining towards building a responsible artificial intelligence system.","PeriodicalId":93612,"journal":{"name":"Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society","volume":"22 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2019-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"143","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3375627.3375812","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 143
Abstract
Concerns within the machine learning community and external pressure from regulators over the vulnerabilities of machine learning algorithms have spurred research in explainability, robustness, and fairness. Often, issues in explainability, robustness, and fairness are confined to their specific sub-fields, and few tools exist that let model developers build their modeling pipelines in a transparent, accountable, and fair way all at once. This creates a bottleneck on the model developer's side, as they must juggle multiple methods to evaluate their algorithms. In this paper, we present a single framework for analyzing the robustness, fairness, and explainability of a classifier. The framework, which is based on the generation of counterfactual explanations through a custom genetic algorithm, is flexible, model-agnostic, and does not require access to model internals. The framework allows the user to calculate robustness and fairness scores for individual models and to generate explanations for individual predictions, providing a means for actionable recourse (changes to an input that help obtain a desired outcome). This is the first time a unified tool has been developed to address three key issues pertaining to building a responsible artificial intelligence system.
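For intuition, the following is a minimal sketch of the core idea the abstract describes: using a genetic algorithm to search for counterfactuals against a black-box classifier, where candidate inputs evolve toward flipping the model's prediction while staying close to the original point. This is not the authors' CERTIFAI implementation; the function name `evolve_counterfactual`, the fitness definition, and all parameters (population size, mutation scale) are illustrative assumptions.

```python
import numpy as np

def evolve_counterfactual(predict, x, desired_class,
                          pop_size=100, generations=50,
                          mutation_scale=0.1, rng=None):
    """Genetic-algorithm sketch for counterfactual search (illustrative only).

    predict: black-box function mapping a 2-D array of inputs to
             predicted class labels (no access to model internals needed).
    x:       1-D original input for which we want actionable recourse.
    Returns the closest found candidate classified as desired_class,
    or None if the search fails.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Initialise a population of randomly perturbed copies of x.
    pop = x + rng.normal(scale=mutation_scale, size=(pop_size, x.size))
    best = None
    for _ in range(generations):
        labels = predict(pop)
        dist = np.linalg.norm(pop - x, axis=1)
        # Fitness: prefer small distance to x, but only candidates that
        # actually achieve the desired prediction are feasible.
        fitness = np.where(labels == desired_class, -dist, -np.inf)
        feasible = np.isfinite(fitness)
        if feasible.any():
            i = int(np.argmax(fitness))
            if best is None or dist[i] < np.linalg.norm(best - x):
                best = pop[i].copy()
        # Selection: keep the half of the population closest to x,
        # penalising infeasible candidates, then mutate to refill.
        order = np.argsort(np.where(feasible, dist, dist + 1e6))
        parents = pop[order[: pop_size // 2]]
        children = parents + rng.normal(scale=mutation_scale,
                                        size=parents.shape)
        pop = np.vstack([parents, children])
    return best
```

In the spirit of the paper, the distance from an input to its nearest counterfactual can double as a robustness score (a prediction that requires a large change to flip is robust), and comparing that burden across demographic groups gives a fairness signal; the sketch above only covers the counterfactual search itself.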