From “no clear winner” to an effective Explainable Artificial Intelligence process: An empirical journey

Applied AI Letters · Pub Date: 2021-07-18 · DOI: 10.1002/ail2.36
Jonathan Dodge, Andrew Anderson, Roli Khanna, Jed Irvine, Rupika Dikkala, Kin-Ho Lam, Delyar Tabatabai, Anita Ruangrotsakun, Zeyad Shureih, Minsuk Kahng, Alan Fern, Margaret Burnett
{"title":"From “no clear winner” to an effective Explainable Artificial Intelligence process: An empirical journey","authors":"Jonathan Dodge,&nbsp;Andrew Anderson,&nbsp;Roli Khanna,&nbsp;Jed Irvine,&nbsp;Rupika Dikkala,&nbsp;Kin-Ho Lam,&nbsp;Delyar Tabatabai,&nbsp;Anita Ruangrotsakun,&nbsp;Zeyad Shureih,&nbsp;Minsuk Kahng,&nbsp;Alan Fern,&nbsp;Margaret Burnett","doi":"10.1002/ail2.36","DOIUrl":null,"url":null,"abstract":"<p>“In what circumstances would you want this AI to make decisions on your behalf?” We have been investigating how to enable a user of an Artificial Intelligence-powered system to answer questions like this through a series of empirical studies, a group of which we summarize here. We began the series by (a) comparing four explanation configurations of saliency explanations and/or reward explanations. From this study we learned that, although some configurations had significant strengths, no one configuration was a clear “winner.” This result led us to hypothesize that one reason for the low success rates Explainable AI (XAI) research has in enabling users to create a coherent mental model is that the AI itself does not have a coherent model. This hypothesis led us to (b) build a model-based agent, to compare explaining it with explaining a model-free agent. Our results were encouraging, but we then realized that participants' cognitive energy was being sapped by having to create not only a mental model, but also a process by which to create that mental model. This realization led us to (c) create such a process (which we term <i>After-Action Review for AI</i> or “AAR/AI”) for them, integrate it into the explanation environment, and compare participants' success with AAR/AI scaffolding vs without it. Our AAR/AI studies' results showed that AAR/AI participants were more effective assessing the AI than non-AAR/AI participants, with significantly better precision and significantly better recall at finding the AI's reasoning flaws.</p>","PeriodicalId":72253,"journal":{"name":"Applied AI letters","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2021-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/ail2.36","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Applied AI letters","FirstCategoryId":"1085","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1002/ail2.36","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 3

Abstract

“In what circumstances would you want this AI to make decisions on your behalf?” We have been investigating how to enable a user of an Artificial Intelligence-powered system to answer questions like this through a series of empirical studies, a group of which we summarize here. We began the series by (a) comparing four explanation configurations of saliency explanations and/or reward explanations. From this study we learned that, although some configurations had significant strengths, no one configuration was a clear “winner.” This result led us to hypothesize that one reason for the low success rates Explainable AI (XAI) research has in enabling users to create a coherent mental model is that the AI itself does not have a coherent model. This hypothesis led us to (b) build a model-based agent, to compare explaining it with explaining a model-free agent. Our results were encouraging, but we then realized that participants' cognitive energy was being sapped by having to create not only a mental model, but also a process by which to create that mental model. This realization led us to (c) create such a process (which we term After-Action Review for AI or “AAR/AI”) for them, integrate it into the explanation environment, and compare participants' success with AAR/AI scaffolding vs without it. Our AAR/AI studies' results showed that AAR/AI participants were more effective at assessing the AI than non-AAR/AI participants, with significantly better precision and significantly better recall at finding the AI's reasoning flaws.
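The abstract reports precision and recall at finding the AI's reasoning flaws but does not spell out the scoring procedure. The sketch below is a minimal illustration, under the assumption (not stated in the paper) that each participant flags a set of decision points as flawed and that a ground-truth set of flawed decision points is known; the function name and example values are hypothetical.

```python
# Minimal sketch: scoring one participant's flaw reports against ground truth.
# Assumption (not from the paper): flaws are identified per decision point.

def flaw_finding_scores(flagged: set[int], ground_truth: set[int]) -> tuple[float, float]:
    """Return (precision, recall) for a participant's flagged decision points."""
    true_positives = len(flagged & ground_truth)
    precision = true_positives / len(flagged) if flagged else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return precision, recall

# Hypothetical example: the participant flags decision points 3, 7, and 12;
# the AI's actual reasoning flaws occurred at points 3, 9, and 12.
precision, recall = flaw_finding_scores({3, 7, 12}, {3, 9, 12})
print(f"precision={precision:.2f}, recall={recall:.2f}")  # precision=0.67, recall=0.67
```

Under this reading, "better precision" means fewer spurious flaw reports and "better recall" means fewer of the AI's actual reasoning flaws missed.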
