{"title":"E-SegNet: E-Shaped Structure Networks for Accurate 2D and 3D Medical Image Segmentation.","authors":"Wei Wu, Xin Yang, Chenggui Yao, Ou Liu, Qi Zhao, Jianwei Shuai","doi":"10.34133/research.0869","DOIUrl":null,"url":null,"abstract":"<p><p>U-structure has become a foundational approach in medical image segmentation, consistently demonstrating strong performance across various segmentation tasks. Most current models are based on this framework, customizing encoder-decoder components to achieve higher accuracy across various segmentation challenges. However, this often comes at the cost of increased parameter counts, which inevitably limit their practicality in real-world applications. In this study, we provide an E-shaped segmentation framework that discards the traditional step-by-step resolution recovery decoding process, instead directly aggregating multi-scale features extracted by the encoder at each stage for deep cross-level integration. Additionally, we propose an innovative multi-scale large-kernel convolution (MLKConv) module, designed to enhance high-level feature representation by effectively capturing both local and global contextual information. Compared to U-structure, the proposed E-structured approach substantially reduces parameters while delivering superior performance, especially in complex segmentation tasks. Based on this structure, we develop 2 segmentation networks specifically for 2-dimensional (2D) and 3D medical images. 2D E-SegNet is evaluated on four 2D segmentation benchmark datasets (Synapse multi-organ, ACDC, Kvasir-Seg, and BUSI), while 3D E-SegNet is assessed on four 3D segmentation benchmark datasets (Synapse, ACDC, NIH Pancreas, and Lung). Experimental results demonstrate that our approach outperforms the current leading U-shaped models across multiple datasets, achieving new state-of-the-art (SOTA) performance with fewer parameters. In summary, our research introduces a novel approach to medical image segmentation, offering potential improvements and contributing to ongoing advancements in the field. Our code is publicly available on https://github.com/zhaoqi106/E-SegNet.</p>","PeriodicalId":21120,"journal":{"name":"Research","volume":"8 ","pages":"0869"},"PeriodicalIF":10.7000,"publicationDate":"2025-09-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12408157/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Research","FirstCategoryId":"103","ListUrlMain":"https://doi.org/10.34133/research.0869","RegionNum":1,"RegionCategory":"综合性期刊","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2025/1/1 0:00:00","PubModel":"eCollection","JCR":"Q1","JCRName":"Multidisciplinary","Score":null,"Total":0}
Citations: 0
Abstract
The U-shaped structure has become a foundational approach in medical image segmentation, consistently demonstrating strong performance across various segmentation tasks. Most current models are based on this framework, customizing encoder-decoder components to achieve higher accuracy on diverse segmentation challenges. However, this often comes at the cost of increased parameter counts, which inevitably limits these models' practicality in real-world applications. In this study, we present an E-shaped segmentation framework that discards the traditional step-by-step resolution-recovery decoding process and instead directly aggregates the multi-scale features extracted by the encoder at each stage for deep cross-level integration. Additionally, we propose an innovative multi-scale large-kernel convolution (MLKConv) module, designed to enhance high-level feature representation by effectively capturing both local and global contextual information. Compared to the U-structure, the proposed E-structured approach substantially reduces parameters while delivering superior performance, especially in complex segmentation tasks. Based on this structure, we develop two segmentation networks, one for 2-dimensional (2D) and one for 3D medical images. 2D E-SegNet is evaluated on four 2D segmentation benchmark datasets (Synapse multi-organ, ACDC, Kvasir-Seg, and BUSI), while 3D E-SegNet is assessed on four 3D segmentation benchmark datasets (Synapse, ACDC, NIH Pancreas, and Lung). Experimental results demonstrate that our approach outperforms the current leading U-shaped models across multiple datasets, achieving new state-of-the-art (SOTA) performance with fewer parameters. In summary, our research introduces a novel approach to medical image segmentation, offering potential improvements and contributing to ongoing advancements in the field. Our code is publicly available at https://github.com/zhaoqi106/E-SegNet.
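The abstract names two architectural ideas: replacing the stage-by-stage U-shaped decoder with a direct, one-step aggregation of multi-scale encoder features, and an MLKConv block that mixes local and global context through large depthwise kernels. The authors' implementation lives in the linked repository; purely as an illustration, the sketch below shows what such components could look like in PyTorch. The module names, kernel sizes, and channel widths here are assumptions, not the paper's actual code.

```python
# Illustrative sketch only (assumed PyTorch-style design, not the authors' code):
# a multi-scale large-kernel convolution block and a one-step multi-scale
# aggregation head. Names, kernel sizes, and channel widths are hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MLKConv(nn.Module):
    """Parallel depthwise convolutions with growing kernel sizes, fused by a
    pointwise convolution, so each feature mixes local and wider context."""

    def __init__(self, channels: int, kernel_sizes=(3, 7, 11)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, k, padding=k // 2, groups=channels)
            for k in kernel_sizes
        )
        self.fuse = nn.Conv2d(channels * len(kernel_sizes), channels, kernel_size=1)
        self.norm = nn.BatchNorm2d(channels)
        self.act = nn.GELU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        branches = torch.cat([branch(x) for branch in self.branches], dim=1)
        return self.act(self.norm(self.fuse(branches))) + x  # residual connection


class DirectAggregationHead(nn.Module):
    """Rather than decoding stage by stage as in a U-shaped network, upsample
    every encoder stage to the finest resolution and fuse them in one step."""

    def __init__(self, in_channels, width: int, num_classes: int):
        super().__init__()
        self.reduce = nn.ModuleList(
            nn.Conv2d(c, width, kernel_size=1) for c in in_channels
        )
        self.fuse = nn.Sequential(
            nn.Conv2d(width * len(in_channels), width, kernel_size=1),
            MLKConv(width),
        )
        self.classify = nn.Conv2d(width, num_classes, kernel_size=1)

    def forward(self, features):
        target = features[0].shape[-2:]  # finest encoder stage sets the size
        aligned = [
            F.interpolate(reduce(f), size=target, mode="bilinear", align_corners=False)
            for reduce, f in zip(self.reduce, features)
        ]
        return self.classify(self.fuse(torch.cat(aligned, dim=1)))


# Toy check with four encoder stages at halving resolutions.
feats = [torch.randn(1, c, 64 // 2 ** i, 64 // 2 ** i)
         for i, c in enumerate((64, 128, 256, 512))]
head = DirectAggregationHead((64, 128, 256, 512), width=64, num_classes=9)
print(head(feats).shape)  # torch.Size([1, 9, 64, 64])
```

A final upsampling to the input resolution, any deep supervision, and the 3D variant (Conv3d with trilinear interpolation) are omitted from this sketch; consult the authors' repository for the actual design.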
Journal Introduction:
Research serves as a global platform for academic exchange, collaboration, and technological advancement. The journal welcomes high-quality research contributions from any domain and from authors around the globe.
Covering fundamental research in the life and physical sciences, Research also highlights significant findings and issues in engineering and applied science. The journal features original research articles, reviews, perspectives, and editorials, fostering a diverse and dynamic scholarly environment.