Towards Compositional Interpretability for XAI
Sean Tull, Robin Lorenz, Stephen Clark, Ilyas Khan, Bob Coecke
arXiv - MATH - Category Theory, 2024-06-25. doi: arxiv-2406.17583
Artificial intelligence (AI) is currently based largely on black-box machine learning models which lack interpretability. The field of eXplainable AI (XAI) strives to address this major concern, which is critical in high-stakes areas such as the finance, legal and health sectors.

We present an approach to defining AI models and their interpretability based on category theory. For this we employ the notion of a compositional model, which views a model in terms of formal string diagrams that capture its abstract structure together with its concrete implementation. This comprehensive view incorporates deterministic, probabilistic and quantum models. We compare a wide range of AI models as compositional models, including linear and rule-based models, (recurrent) neural networks, transformers, VAEs, and causal and DisCoCirc models.
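For intuition, the sketch below shows the two operations from which string diagrams are built: sequential composition (plugging one process's outputs into another's inputs) and parallel composition (placing processes side by side). This is a hypothetical toy illustration, not the paper's formal definition; the `Box`, `then` and `tensor` names are invented here, and libraries such as DisCoPy provide a full treatment of diagrammatic composition.

```python
# Minimal sketch of a "compositional model": boxes with fixed input/output
# arities, composed sequentially and in parallel as in a string diagram.
# Illustrative only; all names are hypothetical, not the paper's formalism.

from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class Box:
    """A process with a fixed number of inputs and outputs."""
    name: str
    n_in: int
    n_out: int
    fn: Callable[[Tuple[float, ...]], Tuple[float, ...]]

    def then(self, other: "Box") -> "Box":
        """Sequential composition: feed this box's outputs into `other`."""
        assert self.n_out == other.n_in, "output/input arities must match"
        return Box(f"{self.name} ; {other.name}", self.n_in, other.n_out,
                   lambda xs: other.fn(self.fn(xs)))

    def tensor(self, other: "Box") -> "Box":
        """Parallel composition: run the two boxes side by side."""
        return Box(f"{self.name} (x) {other.name}",
                   self.n_in + other.n_in, self.n_out + other.n_out,
                   lambda xs: self.fn(xs[:self.n_in]) + other.fn(xs[self.n_in:]))


# Two simple deterministic boxes and a diagram built from them.
scale = Box("scale", 1, 1, lambda xs: (2.0 * xs[0],))
add = Box("add", 2, 1, lambda xs: (xs[0] + xs[1],))

diagram = scale.tensor(scale).then(add)       # (scale ⊗ scale) ; add
print(diagram.name, diagram.fn((1.0, 3.0)))   # -> (8.0,)
```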
Next we give a definition of interpretation of a model in terms of its compositional structure, demonstrating how to analyse the interpretability of a model, and using this to clarify common themes in XAI. We find that what makes the standard 'intrinsically interpretable' models so transparent is brought out most clearly diagrammatically. This leads us to the more general notion of compositionally-interpretable (CI) models, which additionally include, for instance, causal, conceptual space, and DisCoCirc models.
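As a familiar reference point, the toy example below illustrates the sense in which a linear model is 'intrinsically interpretable': each component of its structure (one weight per feature, plus a bias) carries a human-readable meaning, and the per-feature contributions can be read off directly. This is an assumed, simplified illustration with made-up feature names, not the paper's categorical definition of an interpretation.

```python
# Toy linear model: each structural component has an obvious interpretation,
# and the prediction decomposes into independently meaningful terms.
# Feature names, weights and inputs are invented for illustration.

feature_names = ["age", "income", "num_accounts"]
weights = {"age": 0.8, "income": -1.5, "num_accounts": 0.3}
bias = 0.1


def predict(x: dict) -> float:
    """Linear model: a sum of per-feature terms plus a bias."""
    return bias + sum(weights[f] * x[f] for f in feature_names)


def explain(x: dict) -> dict:
    """'Interpretation': map each component to its contribution to the output."""
    contributions = {f: weights[f] * x[f] for f in feature_names}
    contributions["bias"] = bias
    return contributions


x = {"age": 0.5, "income": 0.2, "num_accounts": 3.0}
print(predict(x))   # total prediction (1.1)
print(explain(x))   # per-component contributions that sum to the prediction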
We next demonstrate the explainability benefits of CI models. Firstly, their compositional structure may allow the computation of other quantities of interest, and may facilitate inference from the model to the modelled phenomenon by matching its structure. Secondly, they allow for diagrammatic explanations of their behaviour, based on influence constraints, diagram surgery and rewrite explanations.
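To give a rough flavour of an influence constraint: in the diagrammatic setting such a constraint is read off the diagram itself (no wire connects a given input to a given output), whereas the sketch below merely probes the corresponding behavioural claim on a toy model. The model and helper names are invented for illustration and do not reflect the paper's method.

```python
# Sketch of checking an influence constraint behaviourally: the output of this
# toy model is structurally connected only to input `a`, so varying `b` should
# never change it. Names and the testing strategy are assumptions for
# illustration, not the paper's diagrammatic proof technique.

import random


def model(a: float, b: float) -> float:
    """Toy model whose output depends on `a` only; `b` has no path to it."""
    return 3.0 * a + 1.0


def appears_uninfluenced_by_b(trials: int = 100) -> bool:
    """Empirically test the claim 'input b does not influence the output'."""
    for _ in range(trials):
        a = random.uniform(-10, 10)
        b1, b2 = random.uniform(-10, 10), random.uniform(-10, 10)
        if model(a, b1) != model(a, b2):
            return False
    return True


print(appears_uninfluenced_by_b())  # expected: True for this toy model
```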
Finally, we discuss many future directions for the approach, raising the question of how to learn such meaningfully structured models in practice.