Ying Zhang, Rencan Nie, Jinde Cao, Chaozhen Ma, Chengchao Wang
{"title":"SS-SSAN:多模态医学图像融合的自监督子空间注意网络","authors":"Ying Zhang, Rencan Nie, Jinde Cao, Chaozhen Ma, Chengchao Wang","doi":"10.1007/s10462-023-10529-w","DOIUrl":null,"url":null,"abstract":"<div><p>Multi-modal medical image fusion (MMIF) is used to merge multiple modes of medical images for better imaging quality and more comprehensive information, such that enhancing the reliability of clinical diagnosis. Since different types of medical images have different imaging mechanisms and focus on different pathological tissues, how to accurately fuse the information from various medical images has become an obstacle in image fusion research. In this paper, we propose a self-supervised subspace attentional framework for multi-modal image fusion, which is constructed by two sub-networks, i.e., the feature extract network and the feature fusion network. We implement a self-supervised strategy that facilitates the framework adaptively extracts the features of source images with the reconstruction of the fused image. Specifically, we adopt a subspace attentional Siamese Weighted Auto-Encoder as a feature extractor to extract the source image features including local and global features at first. Then, the extracted features are given into a weighted fusion decoding network to reconstruct the fused result, and the shallow features from the extractor are used to assist reconstruct the fused image. Finally, the feature extractor adaptively extracts the optimal features according to the fused results by simultaneously training the two sub-networks. Furthermore, to achieve better fusion results, we design a novel weight estimation in the weighted fidelity loss that measures the importance of each pixel by calculating a mixture of salient features and local contrast features of the image. Experiments demonstrate that our method gives the best results compared with other state-of-the-art fusion approaches.</p></div>","PeriodicalId":8449,"journal":{"name":"Artificial Intelligence Review","volume":"56 1","pages":"421 - 443"},"PeriodicalIF":10.7000,"publicationDate":"2023-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"SS-SSAN: a self-supervised subspace attentional network for multi-modal medical image fusion\",\"authors\":\"Ying Zhang, Rencan Nie, Jinde Cao, Chaozhen Ma, Chengchao Wang\",\"doi\":\"10.1007/s10462-023-10529-w\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>Multi-modal medical image fusion (MMIF) is used to merge multiple modes of medical images for better imaging quality and more comprehensive information, such that enhancing the reliability of clinical diagnosis. Since different types of medical images have different imaging mechanisms and focus on different pathological tissues, how to accurately fuse the information from various medical images has become an obstacle in image fusion research. In this paper, we propose a self-supervised subspace attentional framework for multi-modal image fusion, which is constructed by two sub-networks, i.e., the feature extract network and the feature fusion network. We implement a self-supervised strategy that facilitates the framework adaptively extracts the features of source images with the reconstruction of the fused image. Specifically, we adopt a subspace attentional Siamese Weighted Auto-Encoder as a feature extractor to extract the source image features including local and global features at first. 
Then, the extracted features are given into a weighted fusion decoding network to reconstruct the fused result, and the shallow features from the extractor are used to assist reconstruct the fused image. Finally, the feature extractor adaptively extracts the optimal features according to the fused results by simultaneously training the two sub-networks. Furthermore, to achieve better fusion results, we design a novel weight estimation in the weighted fidelity loss that measures the importance of each pixel by calculating a mixture of salient features and local contrast features of the image. Experiments demonstrate that our method gives the best results compared with other state-of-the-art fusion approaches.</p></div>\",\"PeriodicalId\":8449,\"journal\":{\"name\":\"Artificial Intelligence Review\",\"volume\":\"56 1\",\"pages\":\"421 - 443\"},\"PeriodicalIF\":10.7000,\"publicationDate\":\"2023-06-26\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Artificial Intelligence Review\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://link.springer.com/article/10.1007/s10462-023-10529-w\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Artificial Intelligence Review","FirstCategoryId":"94","ListUrlMain":"https://link.springer.com/article/10.1007/s10462-023-10529-w","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
SS-SSAN: a self-supervised subspace attentional network for multi-modal medical image fusion
Multi-modal medical image fusion (MMIF) merges multiple modalities of medical images to obtain better imaging quality and more comprehensive information, thereby enhancing the reliability of clinical diagnosis. Since different types of medical images have different imaging mechanisms and focus on different pathological tissues, accurately fusing the information from various medical images remains an obstacle in image fusion research. In this paper, we propose a self-supervised subspace attentional framework for multi-modal image fusion, constructed from two sub-networks: a feature extraction network and a feature fusion network. We implement a self-supervised strategy that enables the framework to adaptively extract the features of the source images through reconstruction of the fused image. Specifically, we first adopt a subspace attentional Siamese Weighted Auto-Encoder as a feature extractor to extract the source image features, including both local and global features. The extracted features are then fed into a weighted fusion decoding network to reconstruct the fused result, and the shallow features from the extractor are used to assist in reconstructing the fused image. Finally, by training the two sub-networks simultaneously, the feature extractor adaptively extracts the optimal features according to the fused results. Furthermore, to achieve better fusion results, we design a novel weight estimation in the weighted fidelity loss that measures the importance of each pixel by computing a mixture of the salient features and local contrast features of the image. Experiments demonstrate that our method yields the best results compared with other state-of-the-art fusion approaches.
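To illustrate the pipeline the abstract describes, below is a minimal PyTorch-style sketch of the two sub-networks trained jointly under a self-supervised, weighted reconstruction objective. All module names, layer sizes, and the concatenation-based skip path are illustrative assumptions, the subspace attention module is omitted for brevity, and this is a sketch rather than the authors' implementation.

```python
# A minimal sketch of the two-sub-network, self-supervised setup described in
# the abstract. Module names, layer sizes, and the loss form are assumptions,
# not the authors' implementation; the subspace attention block is omitted.
import torch
import torch.nn as nn

class SiameseEncoder(nn.Module):
    """Shared-weight feature extractor applied to both source modalities."""
    def __init__(self, ch=16):
        super().__init__()
        self.shallow = nn.Sequential(nn.Conv2d(1, ch, 3, padding=1), nn.ReLU())
        self.deep = nn.Sequential(nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU())

    def forward(self, x):
        s = self.shallow(x)       # shallow features, later reused by the decoder
        return s, self.deep(s)    # (shallow, deep) feature maps

class FusionDecoder(nn.Module):
    """Weighted fusion decoding network that reconstructs the fused image."""
    def __init__(self, ch=16):
        super().__init__()
        self.decode = nn.Sequential(
            nn.Conv2d(4 * ch, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, 1, 3, padding=1), nn.Sigmoid())

    def forward(self, s1, f1, s2, f2):
        # Shallow features assist reconstruction via a concatenation skip path.
        return self.decode(torch.cat([s1, f1, s2, f2], dim=1))

encoder, decoder = SiameseEncoder(), FusionDecoder()
opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()))

def train_step(img_a, img_b, weight_a, weight_b):
    """One self-supervised step: both sub-networks are trained jointly, so the
    extractor adapts its features to the quality of the reconstruction."""
    s1, f1 = encoder(img_a)
    s2, f2 = encoder(img_b)
    fused = decoder(s1, f1, s2, f2)
    # Weighted fidelity loss: per-pixel weights favour the more informative source.
    loss = (weight_a * (fused - img_a) ** 2 + weight_b * (fused - img_b) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    return fused, loss
```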
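The per-pixel weights passed to train_step above could be estimated in the spirit the abstract suggests, by mixing a salient-feature map with a local contrast map. The gradient-magnitude saliency proxy, the window size, and the normalization below are all assumptions, not the paper's exact weight estimation.

```python
# A hedged sketch of one plausible per-pixel weight estimate mixing saliency
# with local contrast, as the abstract's weighted fidelity loss suggests.
import torch
import torch.nn.functional as F

def pixel_weights(img_a, img_b, eps=1e-6):
    def saliency(x):
        # Gradient magnitude as a simple salient-feature proxy.
        gx = x[..., :, 1:] - x[..., :, :-1]
        gy = x[..., 1:, :] - x[..., :-1, :]
        return F.pad(gx.abs(), (0, 1)) + F.pad(gy.abs(), (0, 0, 0, 1))

    def local_contrast(x, k=9):
        # Local standard deviation over a k x k window.
        mean = F.avg_pool2d(x, k, stride=1, padding=k // 2)
        sq_mean = F.avg_pool2d(x ** 2, k, stride=1, padding=k // 2)
        return (sq_mean - mean ** 2).clamp(min=0).sqrt()

    score_a = saliency(img_a) + local_contrast(img_a)
    score_b = saliency(img_b) + local_contrast(img_b)
    total = score_a + score_b + eps
    return score_a / total, score_b / total  # weights sum to 1 at each pixel
```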
Journal introduction:
Artificial Intelligence Review, a fully open access journal, publishes cutting-edge research in artificial intelligence and cognitive science. It features critical evaluations of applications, techniques, and algorithms, providing a platform for both researchers and application developers. The journal includes refereed survey and tutorial articles, along with reviews and commentary on significant developments in the field.