Don't be fooled: label leakage in explanation methods and the importance of their quantitative evaluation.

Neil Jethani, Adriel Saporta, Rajesh Ranganath
{"title":"Don't be fooled: label leakage in explanation methods and the importance of their quantitative evaluation.","authors":"Neil Jethani, Adriel Saporta, Rajesh Ranganath","doi":"","DOIUrl":null,"url":null,"abstract":"<p><p>Feature attribution methods identify which features of an input most influence a model's output. Most widely-used feature attribution methods (such as SHAP, LIME, and Grad-CAM) are \"class-dependent\" methods in that they generate a feature attribution vector as a function of class. In this work, we demonstrate that class-dependent methods can \"leak\" information about the selected class, making that class appear more likely than it is. Thus, an end user runs the risk of drawing false conclusions when interpreting an explanation generated by a class-dependent method. In contrast, we introduce \"distribution-aware\" methods, which favor explanations that keep the label's distribution close to its distribution given all features of the input. We introduce SHAP-KL and FastSHAP-KL, two baseline distribution-aware methods that compute Shapley values. Finally, we perform a comprehensive evaluation of seven class-dependent and three distribution-aware methods on three clinical datasets of different high-dimensional data types: images, biosignals, and text.</p>","PeriodicalId":74504,"journal":{"name":"Proceedings of machine learning research","volume":"206 ","pages":"8925-8953"},"PeriodicalIF":0.0000,"publicationDate":"2023-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12022845/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of machine learning research","FirstCategoryId":"1085","ListUrlMain":"","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Feature attribution methods identify which features of an input most influence a model's output. Most widely-used feature attribution methods (such as SHAP, LIME, and Grad-CAM) are "class-dependent" methods in that they generate a feature attribution vector as a function of class. In this work, we demonstrate that class-dependent methods can "leak" information about the selected class, making that class appear more likely than it is. Thus, an end user runs the risk of drawing false conclusions when interpreting an explanation generated by a class-dependent method. In contrast, we introduce "distribution-aware" methods, which favor explanations that keep the label's distribution close to its distribution given all features of the input. We introduce SHAP-KL and FastSHAP-KL, two baseline distribution-aware methods that compute Shapley values. Finally, we perform a comprehensive evaluation of seven class-dependent and three distribution-aware methods on three clinical datasets of different high-dimensional data types: images, biosignals, and text.
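The contrast between class-dependent and distribution-aware attributions can be made concrete with a small sketch. The following Python example is not the authors' implementation; it is a minimal illustration, in the spirit of SHAP-KL as described above, of computing exact Shapley values under a distribution-aware value function: a feature subset is scored by the negative KL divergence between the model's label distribution given all features and its distribution given only that subset. The helper `predict_masked` is an assumed, caller-supplied function that returns class probabilities when only the listed features are observed (e.g., via a surrogate model trained on masked inputs).

```python
import itertools
import math
from typing import Callable, Sequence


def kl_divergence(p: Sequence[float], q: Sequence[float], eps: float = 1e-12) -> float:
    """KL(p || q) between two discrete distributions over the same classes."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def shapley_kl(predict_masked: Callable[[frozenset], Sequence[float]],
               num_features: int) -> list:
    """Exact Shapley values under a distribution-aware value function (illustrative only).

    A subset S is valued by the negative KL divergence between the label
    distribution given all features and the label distribution given only S,
    so subsets that preserve the full predictive distribution are rewarded
    regardless of which class an end user happens to select.
    """
    full = predict_masked(frozenset(range(num_features)))

    def value(subset: frozenset) -> float:
        return -kl_divergence(full, predict_masked(subset))

    phi = [0.0] * num_features
    features = list(range(num_features))
    for i in features:
        others = [j for j in features if j != i]
        for r in range(len(others) + 1):
            for combo in itertools.combinations(others, r):
                s = frozenset(combo)
                # Standard Shapley weight |S|! (n - |S| - 1)! / n!
                weight = (math.factorial(len(s)) *
                          math.factorial(num_features - len(s) - 1) /
                          math.factorial(num_features))
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi
```

Exact enumeration is exponential in the number of features, so a sketch like this is only viable for small inputs; the abstract does not specify how SHAP-KL or FastSHAP-KL estimate these values in practice.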
