A Formal Approach to Explainability

Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society. Pub Date: 2019-01-27. DOI: 10.1145/3306618.3314260

Lior Wolf, Tomer Galanti, Tamir Hazan

Citations: 19

Abstract

We regard explanations as a blending of the input sample and the model's output and offer a few definitions that capture various desired properties of the function that generates these explanations. We study the links between these properties, and between explanation-generating functions and the intermediate representations of learned models. We are able to show, for example, that if the activations of a given layer are consistent with an explanation, then so are the activations of all subsequent layers. In addition, we study the intersection and union of explanations as a way to construct new explanations.



