Gradient-free Post-hoc Explainability Using Distillation Aided Learnable Approach

Debarpan Bhattacharya, Amir H. Poorjam, Deepak Mittal, Sriram Ganapathy
{"title":"利用蒸馏辅助可学习方法实现无梯度的事后可解释性","authors":"Debarpan Bhattacharya, Amir H. Poorjam, Deepak Mittal, Sriram Ganapathy","doi":"arxiv-2409.11123","DOIUrl":null,"url":null,"abstract":"The recent advancements in artificial intelligence (AI), with the release of\nseveral large models having only query access, make a strong case for\nexplainability of deep models in a post-hoc gradient free manner. In this\npaper, we propose a framework, named distillation aided explainability (DAX),\nthat attempts to generate a saliency-based explanation in a model agnostic\ngradient free application. The DAX approach poses the problem of explanation in\na learnable setting with a mask generation network and a distillation network.\nThe mask generation network learns to generate the multiplier mask that finds\nthe salient regions of the input, while the student distillation network aims\nto approximate the local behavior of the black-box model. We propose a joint\noptimization of the two networks in the DAX framework using the locally\nperturbed input samples, with the targets derived from input-output access to\nthe black-box model. We extensively evaluate DAX across different modalities\n(image and audio), in a classification setting, using a diverse set of\nevaluations (intersection over union with ground truth, deletion based and\nsubjective human evaluation based measures) and benchmark it with respect to\n$9$ different methods. In these evaluations, the DAX significantly outperforms\nthe existing approaches on all modalities and evaluation metrics.","PeriodicalId":501289,"journal":{"name":"arXiv - EE - Image and Video Processing","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Gradient-free Post-hoc Explainability Using Distillation Aided Learnable Approach\",\"authors\":\"Debarpan Bhattacharya, Amir H. Poorjam, Deepak Mittal, Sriram Ganapathy\",\"doi\":\"arxiv-2409.11123\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The recent advancements in artificial intelligence (AI), with the release of\\nseveral large models having only query access, make a strong case for\\nexplainability of deep models in a post-hoc gradient free manner. In this\\npaper, we propose a framework, named distillation aided explainability (DAX),\\nthat attempts to generate a saliency-based explanation in a model agnostic\\ngradient free application. The DAX approach poses the problem of explanation in\\na learnable setting with a mask generation network and a distillation network.\\nThe mask generation network learns to generate the multiplier mask that finds\\nthe salient regions of the input, while the student distillation network aims\\nto approximate the local behavior of the black-box model. We propose a joint\\noptimization of the two networks in the DAX framework using the locally\\nperturbed input samples, with the targets derived from input-output access to\\nthe black-box model. We extensively evaluate DAX across different modalities\\n(image and audio), in a classification setting, using a diverse set of\\nevaluations (intersection over union with ground truth, deletion based and\\nsubjective human evaluation based measures) and benchmark it with respect to\\n$9$ different methods. 
In these evaluations, the DAX significantly outperforms\\nthe existing approaches on all modalities and evaluation metrics.\",\"PeriodicalId\":501289,\"journal\":{\"name\":\"arXiv - EE - Image and Video Processing\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-09-17\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"arXiv - EE - Image and Video Processing\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/arxiv-2409.11123\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - EE - Image and Video Processing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.11123","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
The recent advancements in artificial intelligence (AI), with the release of several large models that offer only query access, make a strong case for explaining deep models in a post-hoc, gradient-free manner. In this paper, we propose a framework, named distillation aided explainability (DAX), that attempts to generate a saliency-based explanation in a model-agnostic, gradient-free setting. The DAX approach poses the explanation problem in a learnable setting, using a mask generation network and a distillation network. The mask generation network learns to generate a multiplier mask that identifies the salient regions of the input, while the student distillation network aims to approximate the local behavior of the black-box model. We propose a joint optimization of the two networks in the DAX framework using locally perturbed input samples, with targets derived from input-output access to the black-box model. We extensively evaluate DAX across different modalities (image and audio) in a classification setting, using a diverse set of evaluations (intersection over union with ground truth, deletion-based measures, and subjective human evaluation), and benchmark it against 9 different methods. In these evaluations, DAX significantly outperforms the existing approaches on all modalities and evaluation metrics.
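The abstract describes the two-network setup but not its implementation. Below is a minimal PyTorch-style sketch of how such a joint, query-only optimization could look. The network architectures, the additive-Gaussian perturbation scheme, the loss (matching the black-box outputs plus a mask-sparsity term), and all hyperparameters are illustrative assumptions, not the paper's actual formulation.

```python
# Hypothetical sketch of a DAX-style joint optimization (not the authors' code).
import torch
import torch.nn as nn

class MaskGenerator(nn.Module):
    """Produces a multiplier mask in [0, 1] with the input's spatial size."""
    def __init__(self, channels=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 3, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.net(x)

class Student(nn.Module):
    """Small distillation network approximating the black box locally."""
    def __init__(self, channels=3, num_classes=10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, num_classes),
        )

    def forward(self, x):
        return self.net(x)

def dax_explain(black_box, x, num_classes=10, steps=200, n_perturb=32, noise=0.1):
    """Jointly fit the mask generator and student on locally perturbed copies of x,
    using only input-output (query) access to the black-box model."""
    mask_gen = MaskGenerator(x.shape[1])
    student = Student(x.shape[1], num_classes)
    opt = torch.optim.Adam(
        list(mask_gen.parameters()) + list(student.parameters()), lr=1e-3
    )
    for _ in range(steps):
        # Locally perturbed input samples around x (assumed Gaussian noise).
        x_pert = x + noise * torch.randn(n_perturb, *x.shape[1:])
        with torch.no_grad():
            target = black_box(x_pert)          # gradient-free: query access only
        mask = mask_gen(x_pert)                 # multiplier mask for salient regions
        pred = student(mask * x_pert)           # student sees only the masked input
        # Match the black-box outputs; small sparsity prior keeps the mask compact.
        loss = nn.functional.mse_loss(pred, target) + 1e-3 * mask.mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    with torch.no_grad():
        return mask_gen(x)                      # saliency-style explanation for x

# Usage (hypothetical): `black_box` is any callable returning class scores.
# saliency = dax_explain(black_box_model, image_tensor.unsqueeze(0))
```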