Evaluating perceptual and semantic interpretability of saliency methods: A case study of melanoma

Applied AI Letters · Pub Date: 2022-09-13 · DOI: 10.1002/ail2.77
Harshit Bokadia, Scott Cheng-Hsin Yang, Zhaobin Li, Tomas Folke, Patrick Shafto
{"title":"Evaluating perceptual and semantic interpretability of saliency methods: A case study of melanoma","authors":"Harshit Bokadia,&nbsp;Scott Cheng-Hsin Yang,&nbsp;Zhaobin Li,&nbsp;Tomas Folke,&nbsp;Patrick Shafto","doi":"10.1002/ail2.77","DOIUrl":null,"url":null,"abstract":"<p>In order to be useful, XAI explanations have to be faithful to the AI system they seek to elucidate and also interpretable to the people that engage with them. There exist multiple algorithmic methods for assessing faithfulness, but this is not so for interpretability, which is typically only assessed through expensive user studies. Here we propose two complementary metrics to algorithmically evaluate the interpretability of saliency map explanations. One metric assesses perceptual interpretability by quantifying the visual coherence of the saliency map. The second metric assesses semantic interpretability by capturing the degree of overlap between the saliency map and textbook features—features human experts use to make a classification. We use a melanoma dataset and a deep-neural network classifier as a case-study to explore how our two interpretability metrics relate to each other and a faithfulness metric. Across six commonly used saliency methods, we find that none achieves high scores across all three metrics for all test images, but that different methods perform well in different regions of the data distribution. This variation between methods can be leveraged to consistently achieve high interpretability and faithfulness by using our metrics to inform saliency mask selection on a case-by-case basis. Our interpretability metrics provide a new way to evaluate saliency-based explanations and allow for the adaptive combination of saliency-based explanation methods.</p>","PeriodicalId":72253,"journal":{"name":"Applied AI letters","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2022-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/ail2.77","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Applied AI letters","FirstCategoryId":"1085","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1002/ail2.77","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 3

Abstract

In order to be useful, XAI explanations have to be faithful to the AI system they seek to elucidate and also interpretable to the people who engage with them. There exist multiple algorithmic methods for assessing faithfulness, but this is not so for interpretability, which is typically assessed only through expensive user studies. Here we propose two complementary metrics to algorithmically evaluate the interpretability of saliency map explanations. One metric assesses perceptual interpretability by quantifying the visual coherence of the saliency map. The second metric assesses semantic interpretability by capturing the degree of overlap between the saliency map and textbook features, the features human experts use to make a classification. We use a melanoma dataset and a deep neural network classifier as a case study to explore how our two interpretability metrics relate to each other and to a faithfulness metric. Across six commonly used saliency methods, we find that none achieves high scores on all three metrics for all test images, but that different methods perform well in different regions of the data distribution. This variation between methods can be leveraged to consistently achieve high interpretability and faithfulness by using our metrics to inform saliency mask selection on a case-by-case basis. Our interpretability metrics provide a new way to evaluate saliency-based explanations and allow for the adaptive combination of saliency-based explanation methods.
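The abstract does not spell out the metric definitions, but the following minimal sketch illustrates the kind of computation involved, under stated assumptions: semantic interpretability is approximated here as a Dice overlap between a thresholded saliency map and an expert-annotated feature mask, perceptual interpretability as a simple spatial-contiguity proxy for visual coherence, and per-image mask selection as an equal-weight combination of the three scores. The function names, the 0.5 threshold, the specific overlap and coherence formulas, and the weighting are illustrative assumptions, not the paper's formulation.

```python
import numpy as np


def semantic_interpretability(saliency, expert_mask, threshold=0.5):
    """Dice overlap between a binarized saliency map and an expert-annotated
    textbook-feature mask. Both inputs are 2D arrays; saliency values in [0, 1]."""
    s = saliency >= threshold
    e = expert_mask.astype(bool)
    intersection = np.logical_and(s, e).sum()
    return 2.0 * intersection / (s.sum() + e.sum() + 1e-8)


def perceptual_interpretability(saliency, threshold=0.5):
    """Crude visual-coherence proxy: average fraction of a salient pixel's
    4-neighbours that are also salient (higher = more spatially contiguous)."""
    s = (saliency >= threshold).astype(float)
    if s.sum() == 0:
        return 0.0
    neighbours = (
        np.roll(s, 1, axis=0) + np.roll(s, -1, axis=0)
        + np.roll(s, 1, axis=1) + np.roll(s, -1, axis=1)
    ) / 4.0
    return float((s * neighbours).sum() / s.sum())


def select_saliency_method(saliency_maps, expert_mask, faithfulness_scores):
    """Per-image selection: pick the method whose mask maximizes an
    equal-weight sum of the three metrics (weighting chosen for illustration)."""
    best_method, best_score = None, -np.inf
    for method, smap in saliency_maps.items():
        score = (
            perceptual_interpretability(smap)
            + semantic_interpretability(smap, expert_mask)
            + faithfulness_scores[method]
        )
        if score > best_score:
            best_method, best_score = method, score
    return best_method
```

Under these assumptions, the adaptive combination described in the abstract amounts to running all six saliency methods on each test image and keeping, case by case, the mask that scores best on the chosen metrics.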
