{"title":"ConceptGlassbox: Guided Concept-Based Explanation for Deep Neural Networks","authors":"Radwa El Shawi","doi":"10.1007/s12559-024-10262-8","DOIUrl":null,"url":null,"abstract":"<p>Various industries and fields have utilized machine learning models, particularly those that demand a significant degree of accountability and transparency. With the introduction of the General Data Protection Regulation (GDPR), it has become imperative for machine learning model predictions to be both plausible and verifiable. One approach to explaining these predictions involves assigning an importance score to each input element. Another category aims to quantify the importance of human-understandable concepts to explain global and local model behaviours. The way concepts are constructed in such concept-based explanation techniques lacks inherent interpretability. Additionally, the magnitude and diversity of the discovered concepts make it difficult for machine learning practitioners to comprehend and make sense of the concept space. To this end, we introduce ConceptGlassbox, a novel local explanation framework that seeks to learn high-level transparent concept definitions. Our approach leverages human knowledge and feedback to facilitate the acquisition of concepts with minimal human labelling effort. The ConceptGlassbox learns concepts consistent with the user’s understanding of a concept’s meaning. It then dissects the evidence for the prediction by identifying the key concepts the black-box model uses to arrive at its decision regarding the instance being explained. Additionally, ConceptGlassbox produces counterfactual explanations, proposing the smallest changes to the instance’s concept-based explanation that would result in a counterfactual decision as specified by the user. Our systematic experiments confirm that ConceptGlassbox successfully discovers relevant and comprehensible concepts that are important for neural network predictions.</p>","PeriodicalId":51243,"journal":{"name":"Cognitive Computation","volume":null,"pages":null},"PeriodicalIF":4.3000,"publicationDate":"2024-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Cognitive Computation","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s12559-024-10262-8","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0
Abstract
Machine learning models have been adopted across a wide range of industries and fields, particularly those that demand a high degree of accountability and transparency. With the introduction of the General Data Protection Regulation (GDPR), it has become imperative for machine learning model predictions to be both plausible and verifiable. One family of explanation techniques assigns an importance score to each input element. Another aims to quantify the importance of human-understandable concepts in order to explain global and local model behaviour. However, the way concepts are constructed in such concept-based explanation techniques lacks inherent interpretability, and the number and diversity of the discovered concepts make it difficult for machine learning practitioners to comprehend and make sense of the concept space. To this end, we introduce ConceptGlassbox, a novel local explanation framework that learns high-level, transparent concept definitions. Our approach leverages human knowledge and feedback to acquire concepts with minimal labelling effort, so that the learned concepts are consistent with the user's understanding of their meaning. ConceptGlassbox then dissects the evidence for a prediction by identifying the key concepts the black-box model uses to arrive at its decision for the instance being explained. Additionally, ConceptGlassbox produces counterfactual explanations: the smallest changes to the instance's concept-based explanation that would result in a counterfactual decision specified by the user. Our systematic experiments confirm that ConceptGlassbox discovers relevant and comprehensible concepts that are important for neural network predictions.
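To make the two ideas in the abstract concrete, here is a minimal, hypothetical Python sketch. It is not the paper's implementation: it treats the black box as a function over binary concept activations, scores each concept by whether toggling it flips the prediction, and brute-forces the smallest set of concept flips that yields a user-specified counterfactual class. The toy model, function names, and the exhaustive search are all illustrative assumptions.

```python
# Hypothetical sketch (not the authors' code): concept importance and
# minimal concept-level counterfactuals against a toy black-box model.
from itertools import combinations
import numpy as np

def black_box(concepts: np.ndarray) -> int:
    """Stand-in black-box classifier over binary concept activations."""
    # Toy rule: class 1 iff at least two of the first three concepts are active.
    return int(concepts[:3].sum() >= 2)

def concept_importance(concepts: np.ndarray) -> np.ndarray:
    """Score each concept by whether toggling it changes the prediction."""
    base = black_box(concepts)
    scores = np.zeros(len(concepts))
    for i in range(len(concepts)):
        flipped = concepts.copy()
        flipped[i] = 1 - flipped[i]
        scores[i] = float(black_box(flipped) != base)
    return scores

def counterfactual(concepts: np.ndarray, target: int, max_flips: int = 3):
    """Smallest set of concept flips that makes the model output `target`."""
    for k in range(1, max_flips + 1):
        for idxs in combinations(range(len(concepts)), k):
            candidate = concepts.copy()
            candidate[list(idxs)] = 1 - candidate[list(idxs)]
            if black_box(candidate) == target:
                return idxs, candidate
    return None, None

x = np.array([1, 1, 0, 0, 1])        # concept activations of the instance
print("prediction:", black_box(x))    # -> 1
print("importance:", concept_importance(x))
idxs, cf = counterfactual(x, target=0)
print("minimal flips:", idxs, "->", cf)
```

In the actual framework, the concept activations are learned from data with human feedback rather than given up front, and explanations are computed over that learned concept space; the exhaustive search above stands in only to show what "smallest change to the concept-based explanation" means.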
Journal Introduction
Cognitive Computation is an international, peer-reviewed, interdisciplinary journal that publishes cutting-edge articles describing original basic and applied work on biologically inspired computational accounts of all aspects of natural and artificial cognitive systems. It provides a platform for the dissemination of research, current practices, and future trends in the emerging discipline of cognitive computation, which bridges the gap between the life sciences, social sciences, engineering, the physical and mathematical sciences, and the humanities.