Tahir Hussain, Hayaru Shouno, Mazin Abed Mohammad, Haydar Abdulameer Marhoon, Taukir Alam
Journal: Knowledge-Based Systems, Volume 314, Article 113233 (Journal Article)
Publication date: 2025-03-01
DOI: 10.1016/j.knosys.2025.113233
URL: https://www.sciencedirect.com/science/article/pii/S0950705125002801
Impact factor: 7.6; JCR: Q1 (Computer Science, Artificial Intelligence)
DCSSGA-UNet: Biomedical image segmentation with DenseNet channel spatial and Semantic Guidance Attention
Convolutional neural networks have progressed significantly in the field of biomedical image segmentation, although precision remains a challenge. The inconsistent sizes and shapes of lesion regions make it difficult for existing deep learning methods to extract discriminative features. Additionally, spatial and semantic information is not effectively merged during decoding, resulting in redundant information and semantic gaps. To address these challenges, we propose the Dense Channel Spatial Semantic Guidance Attention UNet (DCSSGA-UNet) architecture, which integrates DenseNet201 as the base encoder with attention mechanisms to enhance segmentation performance. The decoder follows the standard U-Net pipeline, while the encoder captures global multi-scale features through dense convolutional and transition blocks, which enhances the model’s ability to distinguish intricate details. The channel spatial attention (CSA) and semantic guidance attention (SGA) modules selectively focus on important features and reduce redundancy, effectively bridging semantic gaps. Tests conducted on three medical image datasets (CVC-ClinicDB, CVC-ColonDB, and Kvasir-SEG) showed that the proposed DCSSGA-UNet model handled object variability with improved results and outperformed other comparable methods. It achieved mean intersection-over-union (mIoU) scores of 95.67%, 92.39%, and 93.97% and mean Dice coefficient (mDice) scores of 98.85%, 95.71%, and 96.10% on the three datasets, respectively. These results highlight the model’s precision and versatility, making it a valuable tool for clinical applications, particularly for accurate lesion segmentation and for assisting in the diagnosis and treatment of diseases such as colorectal cancer.
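The abstract reports mIoU and mDice without stating how they are computed. The sketch below shows one common convention, assumed here rather than taken from the paper: each score is computed per image on binary masks and then averaged over the test set. The function names (`iou`, `dice`, `mean_scores`) and the `eps` smoothing term are illustrative assumptions, not the authors' evaluation code.

```python
from typing import Sequence, Tuple

# Binary mask as rows of 0/1 pixels (assumed representation for this sketch).
Mask = Sequence[Sequence[int]]


def _counts(pred: Mask, target: Mask) -> Tuple[int, int, int]:
    """Return (intersection, predicted-positive, target-positive) pixel counts."""
    inter = pred_pos = target_pos = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            inter += 1 if (p and t) else 0
            pred_pos += 1 if p else 0
            target_pos += 1 if t else 0
    return inter, pred_pos, target_pos


def iou(pred: Mask, target: Mask, eps: float = 1e-7) -> float:
    """Intersection over union: |P ∩ T| / |P ∪ T|."""
    inter, pred_pos, target_pos = _counts(pred, target)
    union = pred_pos + target_pos - inter
    # eps keeps the score defined (and equal to 1) when both masks are empty.
    return (inter + eps) / (union + eps)


def dice(pred: Mask, target: Mask, eps: float = 1e-7) -> float:
    """Dice coefficient: 2 |P ∩ T| / (|P| + |T|)."""
    inter, pred_pos, target_pos = _counts(pred, target)
    return (2 * inter + eps) / (pred_pos + target_pos + eps)


def mean_scores(preds: Sequence[Mask], targets: Sequence[Mask]) -> Tuple[float, float]:
    """Average per-image IoU and Dice over a dataset (one possible convention)."""
    ious = [iou(p, t) for p, t in zip(preds, targets)]
    dices = [dice(p, t) for p, t in zip(preds, targets)]
    return sum(ious) / len(ious), sum(dices) / len(dices)


# Toy example: the prediction marks two pixels, the ground truth marks one of them.
pred = [[1, 1], [0, 0]]
target = [[1, 0], [0, 0]]
print(f"IoU={iou(pred, target):.4f}  Dice={dice(pred, target):.4f}")
# prints "IoU=0.5000  Dice=0.6667"
```

Other averaging conventions exist, for example accumulating pixel counts over the whole dataset before dividing, and they can yield different numbers for the same predictions.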
About the journal:
Knowledge-Based Systems, an international and interdisciplinary journal in artificial intelligence, publishes original, innovative, and creative research results in the field. It focuses on systems built on knowledge-based and other artificial intelligence techniques. The journal aims to support human prediction and decision-making through data science and computation techniques, to provide balanced coverage of theory and practical study, and to encourage the development and implementation of knowledge-based intelligence models, methods, systems, and software tools. Applications in business, government, education, engineering, and healthcare are emphasized.