Enhancing Concept-Based Explanation with Vision-Language Models

Imran Hossain, Ghada Zamzmi, Peter Mouton, Yu Sun, Dmitry Goldgof

Proceedings. IEEE International Symposium on Computer-Based Medical Systems, vol. 2024, pp. 219-224
DOI: 10.1109/CBMS61543.2024.00044 | Published: 2024-06-01 (Epub 2024-07-25)
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12458896/pdf/
Abstract
Although concept-based approaches are widely used to explain a model's behavior and assess the contributions of different concepts to decision-making, identifying relevant concepts can be challenging for non-experts. This paper introduces a novel method that simplifies concept selection by leveraging the capabilities of a state-of-the-art large Vision-Language Model (VLM). Our method employs a VLM to select textual concepts that describe the classes in the target dataset. We then transform these influential textual concepts into human-readable image concepts using a text-to-image model. This process allows us to explain the targeted network in a post-hoc manner. Further, we use directional derivatives and concept activation vectors to quantify the importance of the generated concepts. We evaluate our method on a neonatal pain classification task, analyzing the sensitivity of the model's output to the generated concepts. The results demonstrate that the VLM not only generates coherent and meaningful concepts that are easily understandable by non-experts but also achieves performance comparable to that of natural image concepts without incurring additional annotation costs.
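For readers unfamiliar with the concept-importance step the abstract mentions, the sketch below illustrates a generic TCAV-style computation: learn a concept activation vector (CAV) from a target layer's activations on concept versus random images, then score how often the class logit's directional derivative along that CAV is positive. This is a minimal illustration, not the paper's released code; `model`, `layer`, and the image batches are hypothetical placeholders, and the VLM concept-selection and text-to-image stages are not shown.

```python
# Minimal TCAV-style sketch with PyTorch + scikit-learn.
# Assumptions (hypothetical, not from the paper): `model` is a trained
# classifier, `layer` is an intermediate module (e.g., model.layer4), and
# `concept_x`, `random_x`, `inputs` are batches of preprocessed images.
import numpy as np
import torch
from sklearn.linear_model import LogisticRegression


def layer_activations(model, layer, x):
    """Return flattened activations of `layer` for batch `x` via a forward hook."""
    acts = []
    handle = layer.register_forward_hook(lambda m, i, o: acts.append(o.detach()))
    with torch.no_grad():
        model(x)
    handle.remove()
    return acts[0].flatten(start_dim=1)


def learn_cav(model, layer, concept_x, random_x):
    """Fit a linear probe separating concept vs. random activations;
    its (normalized) coefficient vector is the concept activation vector."""
    a_c = layer_activations(model, layer, concept_x).cpu().numpy()
    a_r = layer_activations(model, layer, random_x).cpu().numpy()
    X = np.concatenate([a_c, a_r])
    y = np.concatenate([np.ones(len(a_c)), np.zeros(len(a_r))])
    cav = LogisticRegression(max_iter=1000).fit(X, y).coef_[0]
    return cav / np.linalg.norm(cav)


def tcav_score(model, layer, cav, inputs, class_idx):
    """Fraction of inputs whose class logit increases along the CAV direction,
    i.e. whose directional derivative at `layer` is positive."""
    acts = []
    handle = layer.register_forward_hook(lambda m, i, o: acts.append(o))
    logits = model(inputs)
    handle.remove()
    grads = torch.autograd.grad(logits[:, class_idx].sum(), acts[0])[0]
    grads = grads.flatten(start_dim=1).cpu().numpy()
    return float(np.mean(grads @ cav > 0))
```

In this setup the generated image concepts from the text-to-image model would play the role of `concept_x`, so each VLM-proposed concept receives a score for the pain class without any manual concept annotation.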