Chenglu Zong, Weiwei Gao, Yu Fang, Fengjuan Gao, Zuxiang Wang
Journal: International Journal of Imaging Systems and Technology, Volume 35, Issue 5 (Impact Factor 2.5; JCR Q2, Engineering, Electrical & Electronic)
Published: 2025-09-18
DOI: 10.1002/ima.70204
URL: https://onlinelibrary.wiley.com/doi/10.1002/ima.70204
Citations: 0
Integrating Cross-Scale Attention With Atrous Spatial Pyramid Pooling for Accurate Optic Disc and Cup Segmentation
The Cup-to-Disc Ratio (CDR) measures the size of the Optic Cup (OC) relative to the Optic Disc (OD) and plays a crucial role in glaucoma diagnosis. Accurate segmentation of the OC and OD is therefore the first step toward reliable glaucoma detection. Precise segmentation is challenging, however, because blood vessels traverse the OC and OD regions, the boundaries are blurred, and the OC and OD are relatively small. To address these challenges, the Atrous Spatial Pyramid CrossFormer-U-Net (ACC-U-Net) is proposed. CrossFormer is integrated into the encoder to enhance the integrity of the OC and OD segmentation boundaries by constructing global attention mechanisms in both the horizontal and vertical directions. In addition, an Atrous Spatial Pyramid Pooling (ASPP) head is incorporated at the end of the decoder, allowing the model to capture multi-level feature information of the OC and OD through multiple parallel dilated convolutions, which improves the segmentation accuracy of the OC, the OD, and their irregular boundaries. Finally, a Cross Entropy and Dice (CD) Loss is introduced to strengthen the model's focus on the OC, which is otherwise easily overlooked because of its small proportion. Ablation studies and comparative experiments are performed on three publicly available datasets. Compared with U-Net, ACC-U-Net improves mean Intersection over Union (mIoU), mean Dice, and mean Accuracy (mACC) by 9.96%/2.75%/4.54%, 2.65%/2.94%/5.31%, and 5.89%/5.57%/4.21%, respectively. The proposed model also outperforms nine other models in segmentation accuracy on the three datasets. ACC-U-Net thus segments the OC and OD accurately, providing precise CDR values that could assist in the diagnosis of glaucoma. Source code and pretrained models are available at https://github.com/zong1019/segmentation-OCOD.git.
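The abstract does not give the exact form of the CD Loss, so the following is only a minimal sketch of the general idea, assuming an equally weighted sum of pixel-wise cross entropy and a class-averaged soft Dice term. The function names, the `ce_weight` and `dice_weight` parameters, the `confident` helper, and the class numbering (0 = background, 1 = OD, 2 = OC) are illustrative assumptions, not taken from the paper or its repository. The toy example shows why a Dice term can help a small class: cross entropy averages over pixels, so two missed OC pixels out of 100 barely move it, whereas Dice averages over classes.

```python
import math

EPS = 1e-7  # lower bound on probabilities to avoid log(0)


def cross_entropy(probs, target):
    """Mean pixel-wise cross entropy.

    probs:  list of per-pixel class-probability lists
    target: list of integer class ids, one per pixel
    """
    total = sum(-math.log(max(p[t], EPS)) for p, t in zip(probs, target))
    return total / len(target)


def dice_loss(probs, target, num_classes, smooth=1e-6):
    """1 minus the soft Dice coefficient averaged over classes."""
    dice_sum = 0.0
    for c in range(num_classes):
        inter = sum(p[c] for p, t in zip(probs, target) if t == c)
        pred_sum = sum(p[c] for p in probs)
        true_sum = sum(1 for t in target if t == c)
        dice_sum += (2.0 * inter + smooth) / (pred_sum + true_sum + smooth)
    return 1.0 - dice_sum / num_classes


def cd_loss(probs, target, num_classes, ce_weight=1.0, dice_weight=1.0):
    """Assumed combination: weighted sum of cross entropy and Dice loss."""
    return (ce_weight * cross_entropy(probs, target)
            + dice_weight * dice_loss(probs, target, num_classes))


def confident(label, num_classes, p=0.98):
    """Hypothetical helper: probability vector putting mass p on `label`."""
    rest = (1.0 - p) / (num_classes - 1)
    return [p if c == label else rest for c in range(num_classes)]


# Toy "image" of 100 pixels: 90 background, 8 disc, 2 cup.
target = [0] * 90 + [1] * 8 + [2] * 2
# A prediction that labels the two cup pixels as disc and is right elsewhere.
missed_cup = [confident(0, 3)] * 90 + [confident(1, 3)] * 10

print("cross entropy:", cross_entropy(missed_cup, target))
print("dice loss:    ", dice_loss(missed_cup, target, 3))
print("combined:     ", cd_loss(missed_cup, target, 3))
```

On this toy input the Dice term is noticeably larger than the cross-entropy term, because the cup class contributes one third of the Dice average despite covering only 2% of the pixels. The relative weighting of the two terms is a tunable choice in this sketch.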
About the journal:
The International Journal of Imaging Systems and Technology (IMA) is a forum for the exchange of ideas and results relevant to imaging systems, including imaging physics and informatics. The journal covers all imaging modalities in humans and animals.
IMA accepts technically sound and scientifically rigorous research in the interdisciplinary field of imaging, including relevant algorithmic research and hardware and software development, and their applications relevant to medical research. The journal provides a platform to publish original research in structural and functional imaging.
The journal is also open to imaging studies of the human body and of animals that describe novel diagnostic imaging and analysis methods. Technical, theoretical, and clinical research in both normal and clinical populations is encouraged. Submissions describing methods, software, databases, and replication studies, as well as negative results, are also considered.
The scope of the journal includes, but is not limited to, the following in the context of biomedical research:
Imaging and neuro-imaging modalities: structural MRI, functional MRI, PET, SPECT, CT, ultrasound, EEG, MEG, NIRS etc.;
Neuromodulation and brain stimulation techniques such as TMS and tDCS;
Software and hardware for imaging, especially related to human and animal health;
Image segmentation in normal and clinical populations;
Pattern analysis and classification using machine learning techniques;
Computational modeling and analysis;
Brain connectivity and connectomics;
Systems-level characterization of brain function;
Neural networks and neurorobotics;
Computer vision, based on human/animal physiology;
Brain-computer interface (BCI) technology;
Big data, databasing and data mining.