Global-Local Transformer Network for Automatic Retinal Pathological Fluid Segmentation in Optical Coherence Tomography Images

IF 4.9, Zone 2 (Medicine), Q1 in COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS

Computer methods and programs in biomedicine Pub Date : 2025-04-10 DOI:10.1016/j.cmpb.2025.108772

Feng Li , Hao Wei , Xinyu Sheng , Yuyang Chen , Haidong Zou , Song Huang

Computer methods and programs in biomedicine, volume 266, Article 108772. Journal Article. https://www.sciencedirect.com/science/article/pii/S0169260725001890

Citations: 0

Abstract

Background and Objective

Retinal pathological fluid, including intraretinal fluid (IRF), subretinal fluid (SRF), and pigment epithelial detachment (PED), is a pivotal biomarker, and its accurate segmentation is critical for diagnosis and treatment management in various retinopathies. However, segmenting pathological fluid in optical coherence tomography (OCT) images remains challenging because of large variations in location, size, and shape; low intensity contrast between fluid and surrounding tissue; speckle noise; and the high similarity between fluid and background. Furthermore, because convolution operations are intrinsically local, most automatic retinal fluid segmentation approaches built on deep convolutional neural networks have limited capacity to capture pathological features with global dependencies and are therefore prone to segmentation errors. Accordingly, automatic methods for accurate segmentation and quantitative analysis of multiple types of retinal fluid in OCT images are of great significance.
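The locality of convolution mentioned above can be made concrete with a small receptive-field calculation. The sketch below assumes the standard VGG-19 layout (3x3 convolutions in blocks of 2, 2, 4, 4, 4, separated by 2x2 max pooling with stride 2); how the authors adapt VGG-19 inside GLTNet is not stated in the abstract, so the layer list and the function name `receptive_field` are illustrative only.

```python
def receptive_field(layers):
    """Theoretical receptive field (in input pixels) of a stack of layers.

    layers: iterable of (kernel_size, stride) pairs, applied in order.
    """
    field, jump = 1, 1  # jump = distance in input pixels between adjacent outputs
    for kernel, stride in layers:
        field += (kernel - 1) * jump
        jump *= stride
    return field


CONV = (3, 1)  # 3x3 convolution, stride 1
POOL = (2, 2)  # 2x2 max pooling, stride 2

# Assumed standard VGG-19 convolutional layout, without the final pooling layer.
convs_per_block = [2, 2, 4, 4, 4]
vgg19_layers = []
for index, count in enumerate(convs_per_block):
    vgg19_layers += [CONV] * count
    if index < len(convs_per_block) - 1:
        vgg19_layers.append(POOL)

print(receptive_field(vgg19_layers))           # after the last convolution
print(receptive_field(vgg19_layers + [POOL]))  # after the final pooling layer
```

Under these assumptions, a unit after the last convolution sees a 252-pixel window (268 after the final pooling), and the effective receptive field is usually smaller than this theoretical bound. A self-attention layer, by contrast, relates every position to every other position in a single step.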

Methods

In this paper, we developed a novel global-local Transformer network (GLTNet), based on a U-shaped architecture, for simultaneously segmenting multiple types of pathological fluid in retinal OCT images. In GLTNet, we designed a global-local attention module (GLAM) and integrated it into the VGG-19 backbone to learn more discriminative feature representations related to pathological fluid and to suppress irrelevant noise in OCT images. In addition, we constructed a multi-scale Transformer module (MSTM) on top of the encoder pathway to explore non-local characteristics at various scales, with long-range dependency information, from multiple encoder layers. With both modules integrated to form a strong U-Net encoder, the network captures finer details, enabling precise segmentation of multiple types of retinal fluid in OCT images.
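To illustrate what "non-local" means here, the following is a minimal sketch of generic single-head scaled dot-product self-attention, the basic operation behind Transformer modules. It is not the authors' GLAM or MSTM, whose internals the abstract does not describe; the identity query/key/value projections, the toy patch vectors, and the names `softmax` and `self_attention` are assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention with identity projections.

    tokens: list of equal-length feature vectors, one per image patch.
    Returns (outputs, weights), where weights[i][j] is how strongly
    token i attends to token j.
    """
    dim = len(tokens[0])
    outputs, weights = [], []
    for query in tokens:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in tokens
        ]
        row = softmax(scores)
        weights.append(row)
        outputs.append(
            [sum(row[j] * tokens[j][c] for j in range(len(tokens))) for c in range(dim)]
        )
    return outputs, weights


# Four toy patch embeddings: the first and last are similar but far apart.
patches = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [1.0, 0.0]]
out, attn = self_attention(patches)
print([round(w, 3) for w in attn[0]])
```

In this toy input, the first patch attends to the distant last patch as strongly as to itself and more strongly than to its immediate neighbour, because attention weights depend on feature similarity rather than spatial distance.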

Results

We evaluated the segmentation performance of GLTNet on the Kermany, DUKE, and UMN datasets. Comprehensive experiments on the Kermany dataset showed that our model achieved an overall Dice coefficient of 0.8395, IoU of 0.7657, sensitivity of 0.8631, and precision of 0.8202, remarkably outperforming other state-of-the-art retinal fluid segmentation approaches. The results on the DUKE and UMN datasets suggested that our model generalizes satisfactorily.
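For reference, the four reported metrics can be computed per fluid class from pixel-wise counts of true positives, false positives, and false negatives. The sketch below uses the standard definitions; the label encoding (0 = background, 1 = IRF, 2 = SRF), the toy masks, and the function name `fluid_metrics` are hypothetical, and the abstract does not say how the "overall" scores were averaged across classes or images.

```python
def fluid_metrics(pred, truth, label):
    """Dice, IoU, sensitivity, and precision for one class label.

    pred, truth: 2D lists of integer class labels with the same shape.
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p == label and t == label:
                tp += 1
            elif p == label:
                fp += 1
            elif t == label:
                fn += 1

    def ratio(numerator, denominator):
        # Undefined when the class is absent from both masks.
        return numerator / denominator if denominator else float("nan")

    return {
        "dice": ratio(2 * tp, 2 * tp + fp + fn),
        "iou": ratio(tp, tp + fp + fn),
        "sensitivity": ratio(tp, tp + fn),
        "precision": ratio(tp, tp + fp),
    }


# Toy masks with an assumed encoding: 0 = background, 1 = IRF, 2 = SRF.
truth = [[0, 1, 1, 0],
         [0, 1, 1, 0],
         [2, 2, 0, 0]]
pred = [[0, 1, 1, 1],
        [0, 1, 0, 0],
        [2, 2, 0, 0]]

print(fluid_metrics(pred, truth, label=1))
# {'dice': 0.75, 'iou': 0.6, 'sensitivity': 0.75, 'precision': 0.75}
```

For a single mask, Dice and IoU are tied by Dice = 2·IoU / (1 + IoU). That identity generally does not hold for scores averaged over many images or classes, which is one reason averaged Dice and IoU values need not satisfy it.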

Conclusions

Compared with current cutting-edge methods, the developed GLTNet achieved a significant boost in retinal fluid segmentation performance and demonstrated good generalization and robustness. It therefore has great potential to assist ophthalmologists in diagnosing a variety of eye disorders and in developing as-needed therapy regimens.


Source journal

Computer methods and programs in biomedicine (Engineering Technology, Engineering: Biomedical)

CiteScore

12.30

Self-citation rate

6.60%

Publications

601

Review time

135 days

Journal description: To encourage the development of formal computing methods, and their application in biomedical research and medical practice, by illustration of fundamental principles in biomedical informatics research; to stimulate basic research into application software design; to report the state of research of biomedical information processing projects; to report new computer methodologies applied in biomedical areas; the eventual distribution of demonstrable software to avoid duplication of effort; to provide a forum for discussion and improvement of existing software; to optimize contact between national organizations and regional user groups by promoting an international exchange of information on formal methods, standards and software in biomedicine. Computer Methods and Programs in Biomedicine covers computing methodology and software systems derived from computing science for implementation in all aspects of biomedical research and medical practice. It is designed to serve: biochemists; biologists; geneticists; immunologists; neuroscientists; pharmacologists; toxicologists; clinicians; epidemiologists; psychiatrists; psychologists; cardiologists; chemists; (radio)physicists; computer scientists; programmers and systems analysts; biomedical, clinical, electrical and other engineers; teachers of medical informatics and users of educational software.