Reassuring, Misleading, Debunking: Comparing Effects of XAI Methods on Human Decisions

Christina Humer, Andreas Hinterreiter, Benedikt Leichtmann, Martina Mara, Marc Streit
{"title":"Reassuring, Misleading, Debunking: Comparing Effects of XAI Methods on Human Decisions","authors":"Christina Humer, Andreas Hinterreiter, Benedikt Leichtmann, Martina Mara, Marc Streit","doi":"10.1145/3665647","DOIUrl":null,"url":null,"abstract":"\n Trust calibration is essential in AI-assisted decision-making. If human users understand the rationale on which an AI model has made a prediction, they can decide whether they consider this prediction reasonable. Especially in high-risk tasks such as mushroom hunting (where a wrong decision may be fatal), it is important that users make correct choices to trust or overrule the AI. Various explainable AI (XAI) methods are currently being discussed as potentially useful for facilitating understanding and subsequently calibrating user trust. So far, however, it remains unclear which approaches are most effective. In this paper, the effects of XAI methods on human AI-assisted decision-making in the high-risk task of mushroom picking were tested. For that endeavor, the effects of (\n i\n ) Grad-CAM attributions, (\n ii\n ) nearest-neighbor examples, and (\n iii\n ) network-dissection concepts were compared in a between-subjects experiment with\n \n \\(N=501\\)\n \n participants representing end-users of the system. In general, nearest-neighbor examples improved decision correctness the most. However, varying effects for different task items became apparent. All explanations seemed to be particularly effective when they revealed reasons to (\n i\n ) doubt a specific AI classification when the AI was wrong and (\n ii\n ) trust a specific AI classification when the AI was correct. Our results suggest that well-established methods, such as Grad-CAM attribution maps, might not be as beneficial to end users as expected and that XAI techniques for use in real-world scenarios must be chosen carefully.\n","PeriodicalId":3,"journal":{"name":"ACS Applied Electronic Materials","volume":null,"pages":null},"PeriodicalIF":4.3000,"publicationDate":"2024-05-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACS Applied Electronic Materials","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1145/3665647","RegionNum":3,"RegionCategory":"材料科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
引用次数: 0

Abstract

Trust calibration is essential in AI-assisted decision-making. If human users understand the rationale on which an AI model has made a prediction, they can decide whether they consider this prediction reasonable. Especially in high-risk tasks such as mushroom hunting (where a wrong decision may be fatal), it is important that users make correct choices to trust or overrule the AI. Various explainable AI (XAI) methods are currently being discussed as potentially useful for facilitating understanding and subsequently calibrating user trust. So far, however, it remains unclear which approaches are most effective. In this paper, the effects of XAI methods on human AI-assisted decision-making in the high-risk task of mushroom picking were tested. For that endeavor, the effects of (i) Grad-CAM attributions, (ii) nearest-neighbor examples, and (iii) network-dissection concepts were compared in a between-subjects experiment with \(N=501\) participants representing end-users of the system. In general, nearest-neighbor examples improved decision correctness the most. However, varying effects for different task items became apparent. All explanations seemed to be particularly effective when they revealed reasons to (i) doubt a specific AI classification when the AI was wrong and (ii) trust a specific AI classification when the AI was correct. Our results suggest that well-established methods, such as Grad-CAM attribution maps, might not be as beneficial to end users as expected and that XAI techniques for use in real-world scenarios must be chosen carefully.
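
For context, Grad-CAM attribution maps highlight the image regions that most influenced a classifier's prediction by weighting the final convolutional feature maps with their class-specific gradients. The sketch below shows one common way to compute such a map; it is not the study's implementation, and the ResNet-50 backbone, the hooked layer, and the preprocessing pipeline are assumptions for illustration only.

```python
# Minimal Grad-CAM sketch (not the paper's implementation), assuming a
# torchvision ResNet-50 classifier; swap in any CNN with a spatial conv block.
import torch
import torch.nn.functional as F
from torchvision.models import resnet50, ResNet50_Weights

weights = ResNet50_Weights.DEFAULT
model = resnet50(weights=weights).eval()
preprocess = weights.transforms()

activations, gradients = {}, {}

def hook_features(module, inputs, output):
    # Keep the spatial feature maps of the last conv block and capture their
    # gradient during the backward pass via a tensor hook.
    activations["maps"] = output
    output.register_hook(lambda grad: gradients.update(maps=grad.detach()))

model.layer4[-1].register_forward_hook(hook_features)

def grad_cam(image_tensor, target_class=None):
    """Return an (H, W) relevance map in [0, 1] for the given (or predicted) class."""
    x = image_tensor.unsqueeze(0)
    logits = model(x)
    if target_class is None:
        target_class = logits.argmax(dim=1).item()
    model.zero_grad()
    logits[0, target_class].backward()

    # Channel weights are the global-average-pooled gradients; the map is the
    # ReLU of the weighted sum of feature maps, upsampled to the input size.
    channel_weights = gradients["maps"].mean(dim=(2, 3), keepdim=True)          # (1, C, 1, 1)
    cam = F.relu((channel_weights * activations["maps"].detach()).sum(dim=1))   # (1, h, w)
    cam = F.interpolate(cam.unsqueeze(1), size=x.shape[-2:],
                        mode="bilinear", align_corners=False)[0, 0]
    cam -= cam.min()
    return (cam / cam.max().clamp(min=1e-8)).cpu()

# Usage with a PIL image `img`, e.g. a mushroom photo:
#   heatmap = grad_cam(preprocess(img))
```

In the overlay shown to users, such a heatmap is typically resized to the original photograph and blended with it, so that warm regions indicate the pixels the model relied on for its species prediction.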