Consistent Counterfactual Explanations via Anomaly Control and Data Coherence

IEEE Transactions on Artificial Intelligence. Pub Date: 2024-11-19. DOI: 10.1109/TAI.2024.3496616

Maria Movin; Federico Siciliano; Rui Ferreira; Fabrizio Silvestri; Gabriele Tolomei

Vol. 6, No. 4, pp. 794-804. https://ieeexplore.ieee.org/document/10758426/

Citations: 0

Abstract

Algorithmic recourses are popular methods to provide individuals impacted by machine learning models with recommendations on feasible actions for a more favorable prediction. Most of the previous algorithmic recourse methods work under the assumption that the predictive model does not change over time. However, in reality, models in deployment may both be periodically retrained and have their architecture changed. Therefore, it is desirable that the recourse should remain valid when such a model update occurs, unless new evidence arises. We call this feature consistency. This article presents anomaly control and data coherence (ACDC), a novel model-agnostic recourse method that generates counterfactual explanations, i.e., instance-level recourses. ACDC is inspired by anomaly detection methods and uses a one-class classifier to aid the search for valid, consistent, and feasible counterfactual explanations. The one-class classifier asserts that the generated counterfactual explanations lie on the data manifold and are not outliers of the target class. We compare ACDC against several state-of-the-art recourse methods across four datasets. Our experiments show that ACDC outperforms baselines both in generating consistent counterfactual explanations, and in generating feasible and plausible counterfactual explanations, while still having proximity measures similar to the baseline methods targeting the data manifold.
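The idea of using a one-class classifier to keep counterfactuals on the data manifold can be illustrated with a small sketch. This is not the ACDC algorithm from the article: it is a toy random search, and the k-nearest-neighbour one-class classifier, the linear models, the synthetic data, and every name below (`OneClassKNN`, `make_model`, `find_counterfactual`, `consistency`) are assumptions made purely for illustration.

```python
import math
import random


def knn_score(point, data, k=3):
    """Mean Euclidean distance from `point` to its k nearest points in `data`."""
    dists = sorted(math.dist(point, other) for other in data)
    return sum(dists[:k]) / k


class OneClassKNN:
    """Toy one-class classifier (an assumption, standing in for any
    anomaly detector): a point is an inlier if its k-NN score does not
    exceed a quantile of the leave-one-out scores of the training data."""

    def __init__(self, data, k=3, quantile=0.9):
        self.data, self.k = list(data), k
        loo = sorted(
            knn_score(p, self.data[:i] + self.data[i + 1:], k)
            for i, p in enumerate(self.data)
        )
        self.threshold = loo[int(quantile * (len(loo) - 1))]

    def is_inlier(self, point):
        return knn_score(point, self.data, self.k) <= self.threshold


def make_model(threshold):
    """Hypothetical linear classifier: favourable (1) if x0 + x1 > threshold."""
    return lambda p: int(p[0] + p[1] > threshold)


def find_counterfactual(x, model, occ=None, n_samples=2000, radius=4.0, seed=0):
    """Random search for the closest candidate that the model classifies as
    favourable and, if `occ` is given, that the one-class classifier accepts."""
    rng = random.Random(seed)
    best, best_dist = None, math.inf
    for _ in range(n_samples):
        cand = tuple(xi + rng.uniform(-radius, radius) for xi in x)
        dist = math.dist(x, cand)
        if dist >= best_dist or model(cand) != 1:
            continue
        if occ is not None and not occ.is_inlier(cand):
            continue
        best, best_dist = cand, dist
    return best


def consistency(cf, updated_models):
    """Fraction of updated models for which the counterfactual stays valid."""
    return sum(m(cf) == 1 for m in updated_models) / len(updated_models)


# Synthetic favourable-class training points clustered around (3, 3).
data_rng = random.Random(1)
target_data = [(data_rng.gauss(3, 0.5), data_rng.gauss(3, 0.5)) for _ in range(60)]

occ = OneClassKNN(target_data)
deployed = make_model(4.0)
retrained = [make_model(t) for t in (3.8, 4.2, 4.5)]  # simulated model updates

x = (0.0, 0.0)  # instance with an unfavourable prediction
cf = find_counterfactual(x, deployed, occ)
cf_unconstrained = find_counterfactual(x, deployed)
score = consistency(cf, retrained)
print(cf, cf_unconstrained, score)
```

In this sketch the one-class filter can only move the counterfactual farther from `x` than the unconstrained search does, since both searches draw the same candidates. That mirrors, in a very simplified way, the trade-off the abstract describes between proximity and staying within the target-class data.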


Source journal: IEEE Transactions on Artificial Intelligence

CiteScore: 7.70

Self-citation rate: 0.00%

Publication count: