The hidden assumptions behind counterfactual explanations and principal reasons

Solon Barocas, Andrew D. Selbst, Manish Raghavan
{"title":"The hidden assumptions behind counterfactual explanations and principal reasons","authors":"Solon Barocas, Andrew D. Selbst, Manish Raghavan","doi":"10.1145/3351095.3372830","DOIUrl":null,"url":null,"abstract":"Counterfactual explanations are gaining prominence within technical, legal, and business circles as a way to explain the decisions of a machine learning model. These explanations share a trait with the long-established \"principal reason\" explanations required by U.S. credit laws: they both explain a decision by highlighting a set of features deemed most relevant---and withholding others. These \"feature-highlighting explanations\" have several desirable properties: They place no constraints on model complexity, do not require model disclosure, detail what needed to be different to achieve a different decision, and seem to automate compliance with the law. But they are far more complex and subjective than they appear. In this paper, we demonstrate that the utility of feature-highlighting explanations relies on a number of easily overlooked assumptions: that the recommended change in feature values clearly maps to real-world actions, that features can be made commensurate by looking only at the distribution of the training data, that features are only relevant to the decision at hand, and that the underlying model is stable over time, monotonic, and limited to binary outcomes. We then explore several consequences of acknowledging and attempting to address these assumptions, including a paradox in the way that feature-highlighting explanations aim to respect autonomy, the unchecked power that feature-highlighting explanations grant decision makers, and a tension between making these explanations useful and the need to keep the model hidden. While new research suggests several ways that feature-highlighting explanations can work around some of the problems that we identify, the disconnect between features in the model and actions in the real world---and the subjective choices necessary to compensate for this---must be understood before these techniques can be usefully implemented.","PeriodicalId":377829,"journal":{"name":"Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency","volume":"81 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"173","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3351095.3372830","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 173

Abstract

Counterfactual explanations are gaining prominence within technical, legal, and business circles as a way to explain the decisions of a machine learning model. These explanations share a trait with the long-established "principal reason" explanations required by U.S. credit laws: they both explain a decision by highlighting a set of features deemed most relevant---and withholding others. These "feature-highlighting explanations" have several desirable properties: They place no constraints on model complexity, do not require model disclosure, detail what needed to be different to achieve a different decision, and seem to automate compliance with the law. But they are far more complex and subjective than they appear. In this paper, we demonstrate that the utility of feature-highlighting explanations relies on a number of easily overlooked assumptions: that the recommended change in feature values clearly maps to real-world actions, that features can be made commensurate by looking only at the distribution of the training data, that features are only relevant to the decision at hand, and that the underlying model is stable over time, monotonic, and limited to binary outcomes. We then explore several consequences of acknowledging and attempting to address these assumptions, including a paradox in the way that feature-highlighting explanations aim to respect autonomy, the unchecked power that feature-highlighting explanations grant decision makers, and a tension between making these explanations useful and the need to keep the model hidden. While new research suggests several ways that feature-highlighting explanations can work around some of the problems that we identify, the disconnect between features in the model and actions in the real world---and the subjective choices necessary to compensate for this---must be understood before these techniques can be usefully implemented.
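To make the core idea concrete, below is a minimal sketch of a counterfactual explanation for a toy credit-scoring model: it searches for a small change in feature values that flips a rejection into an approval and then reports only the features that had to change, mirroring the "feature-highlighting" behavior the abstract describes. The feature names, weights, decision threshold, and greedy search strategy are illustrative assumptions made for this sketch; they are not taken from the paper and do not represent the authors' method or any production system.

```python
# Hypothetical sketch of a counterfactual explanation for a toy linear
# credit-scoring model. All feature names, weights, and the threshold are
# illustrative assumptions, not drawn from the paper.

FEATURES = ["income", "debt", "credit_history_years"]
WEIGHTS = {"income": 0.5, "debt": -0.8, "credit_history_years": 0.3}
THRESHOLD = 1.0  # applicants scoring at or above this are approved


def score(applicant):
    """Toy linear score over (already standardized) feature values."""
    return sum(WEIGHTS[f] * applicant[f] for f in FEATURES)


def approved(applicant):
    return score(applicant) >= THRESHOLD


def counterfactual(applicant, step=0.1, max_steps=50):
    """Greedy search: repeatedly nudge the feature with the largest weight
    magnitude in the direction that raises the score, stopping as soon as
    the decision flips. Returns only the features that had to change
    (the 'highlighted' set), or None if no flip is found in the budget."""
    candidate = dict(applicant)
    for _ in range(max_steps):
        if approved(candidate):
            break
        best_f = max(FEATURES, key=lambda f: abs(WEIGHTS[f]))
        direction = 1.0 if WEIGHTS[best_f] > 0 else -1.0
        candidate[best_f] += direction * step
    if not approved(candidate):
        return None
    return {f: (applicant[f], candidate[f])
            for f in FEATURES if candidate[f] != applicant[f]}


if __name__ == "__main__":
    applicant = {"income": 0.5, "debt": 0.9, "credit_history_years": 0.2}
    print("approved:", approved(applicant))
    print("counterfactual changes:", counterfactual(applicant))
```

Even this toy version surfaces the assumptions the paper questions: the "change debt" recommendation only helps the applicant if that feature maps cleanly onto a real-world action, and the step size implicitly treats all features as commensurate.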