Authors: Xiaohong Xiang, Fuyuan Zhang, Xin Deng, Ke Hu
Published in: 2023 IEEE International Conference on Multimedia and Expo (ICME), 2023-07-01
DOI: 10.1109/ICME55011.2023.00061 (https://doi.org/10.1109/ICME55011.2023.00061)
MSG-CAM: Multi-scale inputs make a better visual interpretation of CNN networks
The visualization of deep learning models has been widely studied as an effective means of exploring the decision-making processes within these models. However, current visualization methods suffer from several limitations, such as low resolution and poor visualization of multiple occurrences of the same class. In this paper, we propose a novel visualization technique called MSG-CAM, which is an improvement on the existing Group-CAM method. Our method uses the feature maps and gradients of the last layer of the convolutional neural network to create masks through multi-scale enlargement of the original input image and fusion of the resulting feature maps and gradients. Through both qualitative and quantitative analysis, we have demonstrated that the saliency maps generated by our method are more reasonable and accurately reflect the internal decision-making processes of the neural network.
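The abstract gives no formulas, so the following is only a rough sketch of the general idea of fusing gradient-weighted feature maps computed at several input scales. It is not the authors' MSG-CAM algorithm and omits Group-CAM's grouping and masking steps. It assumes that activations and gradients of the last convolutional layer have already been extracted for each scaled copy of the input, uses Grad-CAM-style channel weights (the spatial mean of each gradient map), and fuses scales by simple averaging. All function names are hypothetical, and plain Python lists are used to keep the example dependency-free.

```python
def grad_weighted_map(activations, gradients):
    """Combine K channels (each an H x W list of lists) into one map.

    Assumption: each channel is weighted by the spatial mean of its gradient,
    as in Grad-CAM, and negative responses are clipped with a ReLU.
    """
    h, w = len(activations[0]), len(activations[0][0])
    cam = [[0.0] * w for _ in range(h)]
    for act, grad in zip(activations, gradients):
        weight = sum(map(sum, grad)) / (h * w)
        for i in range(h):
            for j in range(w):
                cam[i][j] += weight * act[i][j]
    return [[max(v, 0.0) for v in row] for row in cam]


def resize_nearest(m, out_h, out_w):
    """Nearest-neighbour resize of a 2-D map to out_h x out_w."""
    h, w = len(m), len(m[0])
    return [[m[i * h // out_h][j * w // out_w] for j in range(out_w)]
            for i in range(out_h)]


def normalize(m):
    """Min-max scale a 2-D map to [0, 1]; a constant map becomes all zeros."""
    lo = min(map(min, m))
    hi = max(map(max, m))
    if hi == lo:
        return [[0.0] * len(m[0]) for _ in m]
    return [[(v - lo) / (hi - lo) for v in row] for row in m]


def fuse_multiscale(per_scale, out_h, out_w):
    """Average the normalized, resized maps from every scale.

    per_scale: list of (activations, gradients) pairs, one pair per input scale.
    Averaging is an assumed fusion rule chosen for simplicity.
    """
    fused = [[0.0] * out_w for _ in range(out_h)]
    for activations, gradients in per_scale:
        cam = grad_weighted_map(activations, gradients)
        cam = normalize(resize_nearest(cam, out_h, out_w))
        for i in range(out_h):
            for j in range(out_w):
                fused[i][j] += cam[i][j] / len(per_scale)
    return fused


# Toy data: one channel per scale, a 2x2 map and a 4x4 map.
small = ([[[1.0, 0.0], [0.0, 0.0]]], [[[1.0, 1.0], [1.0, 1.0]]])
large_act = [[0.0] * 4 for _ in range(4)]
large_act[3][3] = 1.0
large = ([large_act], [[[2.0] * 4 for _ in range(4)]])

saliency = fuse_multiscale([small, large], 4, 4)
```

In this toy case the coarse map highlights the top-left region and the finer map highlights the bottom-right pixel, so both appear in the fused result. In practice the per-scale tensors would come from a real network, for example via forward and backward hooks in a deep learning framework.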