Towards Compositional Interpretability for XAI

Sean Tull, Robin Lorenz, Stephen Clark, Ilyas Khan, Bob Coecke
{"title":"Towards Compositional Interpretability for XAI","authors":"Sean Tull, Robin Lorenz, Stephen Clark, Ilyas Khan, Bob Coecke","doi":"arxiv-2406.17583","DOIUrl":null,"url":null,"abstract":"Artificial intelligence (AI) is currently based largely on black-box machine\nlearning models which lack interpretability. The field of eXplainable AI (XAI)\nstrives to address this major concern, being critical in high-stakes areas such\nas the finance, legal and health sectors. We present an approach to defining AI models and their interpretability based\non category theory. For this we employ the notion of a compositional model,\nwhich sees a model in terms of formal string diagrams which capture its\nabstract structure together with its concrete implementation. This\ncomprehensive view incorporates deterministic, probabilistic and quantum\nmodels. We compare a wide range of AI models as compositional models, including\nlinear and rule-based models, (recurrent) neural networks, transformers, VAEs,\nand causal and DisCoCirc models. Next we give a definition of interpretation of a model in terms of its\ncompositional structure, demonstrating how to analyse the interpretability of a\nmodel, and using this to clarify common themes in XAI. We find that what makes\nthe standard 'intrinsically interpretable' models so transparent is brought out\nmost clearly diagrammatically. This leads us to the more general notion of\ncompositionally-interpretable (CI) models, which additionally include, for\ninstance, causal, conceptual space, and DisCoCirc models. We next demonstrate the explainability benefits of CI models. Firstly, their\ncompositional structure may allow the computation of other quantities of\ninterest, and may facilitate inference from the model to the modelled\nphenomenon by matching its structure. Secondly, they allow for diagrammatic\nexplanations for their behaviour, based on influence constraints, diagram\nsurgery and rewrite explanations. Finally, we discuss many future directions\nfor the approach, raising the question of how to learn such meaningfully\nstructured models in practice.","PeriodicalId":501135,"journal":{"name":"arXiv - MATH - Category Theory","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2024-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - MATH - Category Theory","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2406.17583","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Artificial intelligence (AI) is currently based largely on black-box machine learning models which lack interpretability. The field of eXplainable AI (XAI) strives to address this major concern, which is critical in high-stakes areas such as the finance, legal and health sectors. We present an approach to defining AI models and their interpretability based on category theory. For this we employ the notion of a compositional model, which sees a model in terms of formal string diagrams which capture its abstract structure together with its concrete implementation. This comprehensive view incorporates deterministic, probabilistic and quantum models. We compare a wide range of AI models as compositional models, including linear and rule-based models, (recurrent) neural networks, transformers, VAEs, and causal and DisCoCirc models. Next we give a definition of interpretation of a model in terms of its compositional structure, demonstrating how to analyse the interpretability of a model, and using this to clarify common themes in XAI. We find that what makes the standard 'intrinsically interpretable' models so transparent is brought out most clearly diagrammatically. This leads us to the more general notion of compositionally-interpretable (CI) models, which additionally include, for instance, causal, conceptual space, and DisCoCirc models. We next demonstrate the explainability benefits of CI models. Firstly, their compositional structure may allow the computation of other quantities of interest, and may facilitate inference from the model to the modelled phenomenon by matching its structure. Secondly, they allow for diagrammatic explanations for their behaviour, based on influence constraints, diagram surgery and rewrite explanations. Finally, we discuss many future directions for the approach, raising the question of how to learn such meaningfully structured models in practice.
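
To ground the notion of a compositional model, the sketch below shows the idea in plain Python: an abstract string-diagram-like wiring of typed boxes paired with a concrete implementation for each box, followed by a toy "diagram surgery" in which one box is swapped out while the rest of the wiring is kept fixed. All names here (Box, then, encode, score) are hypothetical illustrations, not the paper's formal definitions or any existing library's API.

```python
# Minimal illustrative sketch: a compositional model pairs an abstract wiring
# of typed boxes with a concrete implementation of each box, so the model can
# be analysed box-by-box. Not the paper's formalism; names are hypothetical.

from dataclasses import dataclass
from typing import Callable, Tuple


def as_tuple(x):
    """Normalise a box output to a tuple of values (one per output wire)."""
    return x if isinstance(x, tuple) else (x,)


@dataclass
class Box:
    """A box in a string diagram: a named process with typed in/out wires."""
    name: str
    dom: Tuple[str, ...]   # input wire types
    cod: Tuple[str, ...]   # output wire types
    impl: Callable         # concrete implementation (the box's semantics)


def then(f: Box, g: Box) -> Box:
    """Sequential composition f ; g: plug f's output wires into g's inputs."""
    assert f.cod == g.dom, "output and input wire types must match"
    return Box(f"{f.name} ; {g.name}", f.dom, g.cod,
               lambda *xs: g.impl(*as_tuple(f.impl(*xs))))


# A toy two-stage model in which each stage is itself a meaningful box.
encode = Box("encode", ("text",), ("features",), lambda t: len(t.split()))
score = Box("score", ("features",), ("label",),
            lambda n: "long" if n > 3 else "short")
model = then(encode, score)
print(model.impl("string diagrams capture abstract structure"))  # -> long

# "Diagram surgery" in miniature: replace one box with an alternative and
# rerun, keeping the wiring fixed, to probe how that box shapes behaviour.
strict_score = Box("strict_score", ("features",), ("label",),
                   lambda n: "long" if n > 10 else "short")
print(then(encode, strict_score).impl(
    "string diagrams capture abstract structure"))  # -> short
```

String diagrams of this kind are implemented in full generality by libraries such as DisCoPy; the point of the sketch is only that, because the wiring is explicit and typed, interpretability questions can be posed of individual boxes and of the wiring itself.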