Categorical and Continuous Features in Counterfactual Explanations of AI Systems

Impact Factor 3.6 | CAS Tier 4 (Computer Science) | JCR Q2, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Greta Warren, Ruth M.J. Byrne, Mark T. Keane
{"title":"Categorical and Continuous Features in Counterfactual Explanations of AI Systems","authors":"Greta Warren, Ruth M.J. Byrne, Mark T. Keane","doi":"10.1145/3673907","DOIUrl":null,"url":null,"abstract":"<p>Recently, eXplainable AI (XAI) research has focused on the use of counterfactual explanations to address interpretability, algorithmic recourse, and bias in AI system decision-making. The developers of these algorithms claim they meet user requirements in generating counterfactual explanations with “plausible”, “actionable” or “causally important” features. However, few of these claims have been tested in controlled psychological studies. Hence, we know very little about which aspects of counterfactual explanations really help users understand the decisions of AI systems. Nor do we know whether counterfactual explanations are an advance on more traditional causal explanations that have a longer history in AI (e.g., in expert systems). Accordingly, we carried out three user studies to (i) test a fundamental distinction in feature-types, between categorical and continuous features, and (ii) compare the relative effectiveness of counterfactual and causal explanations. The studies used a simulated, automated decision-making app that determined safe driving limits after drinking alcohol, based on predicted blood alcohol content, where users’ responses were measured objectively (using predictive accuracy) and subjectively (using satisfaction and trust judgments). Study 1 (N = 127) showed that users understand explanations referring to categorical features more readily than those referring to continuous features. It also discovered a dissociation between objective and subjective measures: counterfactual explanations elicited higher accuracy than no-explanation controls but elicited no more accuracy than causal explanations, yet counterfactual explanations elicited greater satisfaction and trust than causal explanations. In Study 2 (N = 136) we transformed the continuous features of presented items to be categorical (i.e., binary) and found that these converted features led to highly accurate responding. Study 3 (N = 211) explicitly compared matched items involving either mixed features (i.e., a mix of categorical and continuous features) or categorical features (i.e., categorical and categorically-transformed continuous features), and found that users were more accurate when categorically-transformed features were used instead of continuous ones. It also replicated the dissociation between objective and subjective effects of explanations. The findings delineate important boundary conditions for current and future counterfactual explanation methods in XAI.</p>","PeriodicalId":48574,"journal":{"name":"ACM Transactions on Interactive Intelligent Systems","volume":"11 1","pages":""},"PeriodicalIF":3.6000,"publicationDate":"2024-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Interactive Intelligent Systems","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1145/3673907","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0

Abstract

Recently, eXplainable AI (XAI) research has focused on the use of counterfactual explanations to address interpretability, algorithmic recourse, and bias in AI system decision-making. The developers of these algorithms claim they meet user requirements in generating counterfactual explanations with “plausible”, “actionable” or “causally important” features. However, few of these claims have been tested in controlled psychological studies. Hence, we know very little about which aspects of counterfactual explanations really help users understand the decisions of AI systems. Nor do we know whether counterfactual explanations are an advance on more traditional causal explanations that have a longer history in AI (e.g., in expert systems). Accordingly, we carried out three user studies to (i) test a fundamental distinction in feature-types, between categorical and continuous features, and (ii) compare the relative effectiveness of counterfactual and causal explanations. The studies used a simulated, automated decision-making app that determined safe driving limits after drinking alcohol, based on predicted blood alcohol content, where users’ responses were measured objectively (using predictive accuracy) and subjectively (using satisfaction and trust judgments). Study 1 (N = 127) showed that users understand explanations referring to categorical features more readily than those referring to continuous features. It also discovered a dissociation between objective and subjective measures: counterfactual explanations elicited higher accuracy than no-explanation controls but elicited no more accuracy than causal explanations, yet counterfactual explanations elicited greater satisfaction and trust than causal explanations. In Study 2 (N = 136) we transformed the continuous features of presented items to be categorical (i.e., binary) and found that these converted features led to highly accurate responding. Study 3 (N = 211) explicitly compared matched items involving either mixed features (i.e., a mix of categorical and continuous features) or categorical features (i.e., categorical and categorically-transformed continuous features), and found that users were more accurate when categorically-transformed features were used instead of continuous ones. It also replicated the dissociation between objective and subjective effects of explanations. The findings delineate important boundary conditions for current and future counterfactual explanation methods in XAI.
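As an illustration of the categorical transformation described in Studies 2 and 3, the sketch below shows one way a continuous feature could be binarized and then phrased as a single-feature counterfactual explanation. The feature names, threshold, and wording are hypothetical and for illustration only; they are not taken from the study materials or the authors' code.

```python
# Minimal sketch (not the authors' implementation): binarizing a continuous
# feature and phrasing a single-feature counterfactual explanation around it.
# Feature names, the 3.0-unit threshold, and the outcome text are assumptions.

def binarize(value, threshold, low_label, high_label):
    """Map a continuous value to a binary (categorical) label."""
    return high_label if value >= threshold else low_label

def counterfactual_sentence(feature, actual, alternative, outcome_if_changed):
    """Phrase a simple counterfactual explanation over one feature."""
    return (f"If {feature} had been {alternative} instead of {actual}, "
            f"the system would have predicted {outcome_if_changed}.")

# Continuous version of a hypothetical item: units of alcohol consumed.
units_drunk = 4.5

# Categorically transformed (binary) version of the same feature.
units_category = binarize(units_drunk, threshold=3.0,
                          low_label="below the limit",
                          high_label="above the limit")

# Counterfactual phrased over the continuous feature.
print(counterfactual_sentence("units drunk", f"{units_drunk} units",
                              "2 units", "that it was safe to drive"))

# Counterfactual phrased over the categorically transformed feature.
print(counterfactual_sentence("alcohol consumed", units_category,
                              "below the limit", "that it was safe to drive"))
```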

Source journal

ACM Transactions on Interactive Intelligent Systems (Computer Science – Human-Computer Interaction)
CiteScore: 7.80
Self-citation rate: 2.90%
Articles per year: 38
Journal description: The ACM Transactions on Interactive Intelligent Systems (TiiS) publishes papers on research concerning the design, realization, or evaluation of interactive systems that incorporate some form of machine intelligence. TiiS articles come from a wide range of research areas and communities. An article can take any of several complementary views of interactive intelligent systems, focusing on: the intelligent technology, the interaction of users with the system, or both aspects at once.