Global–Local Hybrid Modulation Network for Retinal Vessel and Coronary Angiograph Segmentation
Pengfei Cai, Biyuan Li, Jinying Ma, Xiao Tian, Jun Yan
Journal of Bionic Engineering, Vol. 22, No. 4, pp. 2050–2074. Published 2025-08-06. DOI: 10.1007/s42235-025-00727-3
https://link.springer.com/article/10.1007/s42235-025-00727-3
Abstract
The segmentation of retinal vessels and coronary angiographs is essential for diagnosing conditions such as glaucoma, diabetes, hypertension, and coronary artery disease. However, retinal vessels and coronary angiographs are characterized by low contrast and complex structures, which makes vessel segmentation challenging. Moreover, CNN-based approaches are limited in capturing long-range pixel relationships because of their focus on local feature extraction, while ViT-based approaches struggle to capture fine local details, which hampers tasks such as vessel segmentation that require precise boundary detection. To address these issues, we propose the Global–Local Hybrid Modulation Network (GLHM-Net), a dual-encoder architecture that combines the strengths of CNNs and ViTs for vessel segmentation. First, the Hybrid Non-Local Transformer Block (HNLTB) is proposed to efficiently consolidate long-range spatial dependencies into a compact feature representation, providing a global perspective while significantly reducing computational overhead. Second, the Collaborative Attention Fusion Block (CAFB) is proposed to more effectively integrate local and global vessel features at the same hierarchical level during the encoding phase. Finally, the proposed Feature Cross-Modulation Block (FCMB) complements the local and global features in the decoding stage, enhancing feature learning and minimizing information loss. Experiments on the DRIVE, CHASEDB1, DCA1, and XCAD datasets yield AUC values of 0.9811, 0.9864, 0.9915, and 0.9919, F1 scores of 0.8288, 0.8202, 0.8040, and 0.8150, and IoU values of 0.7076, 0.6952, 0.6723, and 0.6878, respectively, demonstrating the strong performance of the proposed network for vessel segmentation.
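The abstract names the HNLTB, CAFB, and FCMB modules but does not give their internals, so the following is a minimal, hypothetical PyTorch sketch of the general dual-branch idea it describes: a convolutional (local) branch and a self-attention (global) branch whose outputs are fused by a learned gate. The class name GlobalLocalFusionSketch, the gating scheme, and all hyperparameters are assumptions for illustration only and are not the paper's actual blocks.

```python
# Hypothetical sketch of a dual-branch global-local fusion block in PyTorch.
# This is NOT the paper's HNLTB/CAFB/FCMB; it only illustrates combining a
# convolutional (local) branch with an attention-based (global) branch and
# mixing them with a learned per-pixel gate. All names are assumptions.
import torch
import torch.nn as nn


class GlobalLocalFusionSketch(nn.Module):
    def __init__(self, channels: int, num_heads: int = 4):
        super().__init__()
        # Local branch: a plain 3x3 convolution captures fine vessel detail.
        self.local_branch = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )
        # Global branch: multi-head self-attention over flattened spatial
        # positions captures long-range dependencies.
        self.norm = nn.LayerNorm(channels)
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        # Fusion gate: a 1x1 convolution produces a per-pixel weight that
        # mixes the two branches (a crude stand-in for attention fusion).
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        local = self.local_branch(x)

        # Flatten spatial dims to a token sequence for self-attention.
        tokens = self.norm(x.flatten(2).transpose(1, 2))      # (B, H*W, C)
        global_feat, _ = self.attn(tokens, tokens, tokens)
        global_feat = global_feat.transpose(1, 2).reshape(b, c, h, w)

        # Gated fusion of the local and global branches.
        g = self.gate(torch.cat([local, global_feat], dim=1))
        return g * local + (1.0 - g) * global_feat


if __name__ == "__main__":
    feat = torch.randn(1, 32, 32, 32)    # small feature map for a quick check
    fused = GlobalLocalFusionSketch(32)(feat)
    print(fused.shape)                   # torch.Size([1, 32, 32, 32])
```

In an actual dual-encoder design such as the one the abstract outlines, a block of this kind would sit at each encoder level (and a cross-modulation counterpart in the decoder); the sigmoid gate here is only one of several plausible fusion choices.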
Journal introduction:
The Journal of Bionic Engineering (JBE) is a peer-reviewed journal that publishes original research papers and reviews that apply the knowledge learned from nature and biological systems to solve concrete engineering problems. The topics that JBE covers include but are not limited to:
Mechanisms, kinematical mechanics and control of animal locomotion, development of mobile robots with walking (running and crawling), swimming or flying abilities inspired by animal locomotion.
Structures, morphologies, composition and physical properties of natural and biomaterials; fabrication of new materials mimicking the properties and functions of natural and biomaterials.
Biomedical materials, artificial organs and tissue engineering for medical applications; rehabilitation equipment and devices.
Development of bioinspired computation methods and artificial intelligence for engineering applications.