Christina Humer, Andreas Hinterreiter, Benedikt Leichtmann, Martina Mara, Marc Streit
{"title":"安慰、误导、揭穿:比较 XAI 方法对人类决策的影响","authors":"Christina Humer, Andreas Hinterreiter, Benedikt Leichtmann, Martina Mara, Marc Streit","doi":"10.1145/3665647","DOIUrl":null,"url":null,"abstract":"\n Trust calibration is essential in AI-assisted decision-making. If human users understand the rationale on which an AI model has made a prediction, they can decide whether they consider this prediction reasonable. Especially in high-risk tasks such as mushroom hunting (where a wrong decision may be fatal), it is important that users make correct choices to trust or overrule the AI. Various explainable AI (XAI) methods are currently being discussed as potentially useful for facilitating understanding and subsequently calibrating user trust. So far, however, it remains unclear which approaches are most effective. In this paper, the effects of XAI methods on human AI-assisted decision-making in the high-risk task of mushroom picking were tested. For that endeavor, the effects of (\n i\n ) Grad-CAM attributions, (\n ii\n ) nearest-neighbor examples, and (\n iii\n ) network-dissection concepts were compared in a between-subjects experiment with\n \n \\(N=501\\)\n \n participants representing end-users of the system. In general, nearest-neighbor examples improved decision correctness the most. However, varying effects for different task items became apparent. All explanations seemed to be particularly effective when they revealed reasons to (\n i\n ) doubt a specific AI classification when the AI was wrong and (\n ii\n ) trust a specific AI classification when the AI was correct. Our results suggest that well-established methods, such as Grad-CAM attribution maps, might not be as beneficial to end users as expected and that XAI techniques for use in real-world scenarios must be chosen carefully.\n","PeriodicalId":3,"journal":{"name":"ACS Applied Electronic Materials","volume":null,"pages":null},"PeriodicalIF":4.3000,"publicationDate":"2024-05-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Reassuring, Misleading, Debunking: Comparing Effects of XAI Methods on Human Decisions\",\"authors\":\"Christina Humer, Andreas Hinterreiter, Benedikt Leichtmann, Martina Mara, Marc Streit\",\"doi\":\"10.1145/3665647\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"\\n Trust calibration is essential in AI-assisted decision-making. If human users understand the rationale on which an AI model has made a prediction, they can decide whether they consider this prediction reasonable. Especially in high-risk tasks such as mushroom hunting (where a wrong decision may be fatal), it is important that users make correct choices to trust or overrule the AI. Various explainable AI (XAI) methods are currently being discussed as potentially useful for facilitating understanding and subsequently calibrating user trust. So far, however, it remains unclear which approaches are most effective. In this paper, the effects of XAI methods on human AI-assisted decision-making in the high-risk task of mushroom picking were tested. For that endeavor, the effects of (\\n i\\n ) Grad-CAM attributions, (\\n ii\\n ) nearest-neighbor examples, and (\\n iii\\n ) network-dissection concepts were compared in a between-subjects experiment with\\n \\n \\\\(N=501\\\\)\\n \\n participants representing end-users of the system. In general, nearest-neighbor examples improved decision correctness the most. 
However, varying effects for different task items became apparent. All explanations seemed to be particularly effective when they revealed reasons to (\\n i\\n ) doubt a specific AI classification when the AI was wrong and (\\n ii\\n ) trust a specific AI classification when the AI was correct. Our results suggest that well-established methods, such as Grad-CAM attribution maps, might not be as beneficial to end users as expected and that XAI techniques for use in real-world scenarios must be chosen carefully.\\n\",\"PeriodicalId\":3,\"journal\":{\"name\":\"ACS Applied Electronic Materials\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":4.3000,\"publicationDate\":\"2024-05-22\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"ACS Applied Electronic Materials\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1145/3665647\",\"RegionNum\":3,\"RegionCategory\":\"材料科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"ENGINEERING, ELECTRICAL & ELECTRONIC\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACS Applied Electronic Materials","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1145/3665647","RegionNum":3,"RegionCategory":"材料科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
Reassuring, Misleading, Debunking: Comparing Effects of XAI Methods on Human Decisions
Trust calibration is essential in AI-assisted decision-making. If human users understand the rationale on which an AI model has made a prediction, they can decide whether they consider this prediction reasonable. Especially in high-risk tasks such as mushroom hunting (where a wrong decision may be fatal), it is important that users make correct choices to trust or overrule the AI. Various explainable AI (XAI) methods are currently being discussed as potentially useful for facilitating understanding and subsequently calibrating user trust. So far, however, it remains unclear which approaches are most effective. In this paper, the effects of XAI methods on human AI-assisted decision-making in the high-risk task of mushroom picking were tested. For that endeavor, the effects of (i) Grad-CAM attributions, (ii) nearest-neighbor examples, and (iii) network-dissection concepts were compared in a between-subjects experiment with \(N = 501\) participants representing end-users of the system. In general, nearest-neighbor examples improved decision correctness the most. However, varying effects for different task items became apparent. All explanations seemed to be particularly effective when they revealed reasons to (i) doubt a specific AI classification when the AI was wrong and (ii) trust a specific AI classification when the AI was correct. Our results suggest that well-established methods, such as Grad-CAM attribution maps, might not be as beneficial to end users as expected and that XAI techniques for use in real-world scenarios must be chosen carefully.
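To make the first of the compared explanation types concrete, the following is a minimal, illustrative Grad-CAM sketch in PyTorch. It is not the authors' implementation or study setup; the pretrained ResNet-50, the chosen target layer (model.layer4[-1]), and the random stand-in image are assumptions for demonstration only.

```python
# Illustrative sketch of Grad-CAM attribution maps (Selvaraju et al., 2017),
# the kind of saliency explanation compared in the study. Model, target layer,
# and input image are hypothetical placeholders, not the paper's setup.
import torch
import torch.nn.functional as F
from torchvision import models

def grad_cam(model, image, target_class, conv_layer):
    """Return a coarse heatmap of the regions that drive the score of target_class."""
    activations, gradients = [], []

    # Capture the feature maps of the target conv layer and their gradients.
    fwd = conv_layer.register_forward_hook(
        lambda module, inp, out: activations.append(out))
    bwd = conv_layer.register_full_backward_hook(
        lambda module, grad_in, grad_out: gradients.append(grad_out[0]))

    model.zero_grad()
    score = model(image.unsqueeze(0))[0, target_class]
    score.backward()
    fwd.remove()
    bwd.remove()

    acts, grads = activations[0], gradients[0]       # both (1, C, H, W)
    weights = grads.mean(dim=(2, 3), keepdim=True)   # global-average-pool the gradients
    cam = F.relu((weights * acts).sum(dim=1))        # channel-weighted sum, keep positive evidence
    cam = cam / (cam.max() + 1e-8)                   # normalize to [0, 1]
    return cam.squeeze(0).detach()                   # (H, W); upsample to image size for display

# Usage: attribution map for the top-1 class of a pretrained ResNet-50.
model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT).eval()
image = torch.rand(3, 224, 224)  # stand-in for a preprocessed mushroom photo
top_class = model(image.unsqueeze(0)).argmax(dim=1).item()
heatmap = grad_cam(model, image, top_class, model.layer4[-1])
```

Nearest-neighbor example explanations, the best-performing condition in the study, can be produced analogously by retrieving the training images whose embeddings (for instance, the CNN's penultimate-layer activations) lie closest to the embedding of the query image.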