Feng Li, Hao Wei, Xinyu Sheng, Yuyang Chen, Haidong Zou, Song Huang
Global-Local Transformer Network for Automatic Retinal Pathological Fluid Segmentation in Optical Coherence Tomography Images
Computer Methods and Programs in Biomedicine, volume 266, Article 108772. Published 2025-04-10. DOI: 10.1016/j.cmpb.2025.108772
https://www.sciencedirect.com/science/article/pii/S0169260725001890
Citation count: 0
Abstract
Background and Objective
Retinal pathological fluid, including intraretinal fluid (IRF), subretinal fluid (SRF), and pigment epithelial detachment (PED), is a pivotal biomarker, and its accurate segmentation is critical for diagnosis and treatment management in various retinopathies. However, segmenting pathological fluids from optical coherence tomography (OCT) images still faces several challenges: large variations in location, size, and shape; low intensity contrast between fluids and surrounding tissues; speckle noise interference; and high similarity between fluid and background. Further, owing to the intrinsically local nature of convolution operations, most automatic retinal fluid segmentation approaches built on deep convolutional neural networks have limited capacity to capture pathological features with global dependencies, which makes them prone to segmentation deviations. Accordingly, it is of great significance to develop automatic methods for accurate segmentation and quantitative analysis of multi-type retinal fluids in OCT images.
Methods
In this paper, we developed a novel global-local Transformer network (GLTNet) based on a U-shaped architecture for simultaneously segmenting multiple types of pathological fluid from retinal OCT images. In GLTNet, we designed a global-local attention module (GLAM) and integrated it into the VGG-19 backbone to learn more discriminative, fluid-related feature representations and to suppress irrelevant noise in OCT images. In addition, we constructed a multi-scale Transformer module (MSTM) on top of the encoder pathway to capture non-local characteristics at various scales, with long-range dependency information drawn from multiple encoder layers. With both modules integrated to form a strong U-Net encoder, the network is better able to capture fine details, enabling precise segmentation of multi-type retinal fluids in OCT images.
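To illustrate why a Transformer-style block can relate distant image regions where a convolution cannot, here is a minimal sketch of generic scaled dot-product self-attention. It is not the authors' GLAM or MSTM, whose internals the abstract does not describe. The function names, the identity query/key/value projections, and the toy tokens are all assumptions made for illustration.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: Vector) -> Vector:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens: List[Vector]) -> List[Vector]:
    """Generic scaled dot-product self-attention.

    Assumption: queries, keys and values are the tokens themselves
    (identity projections), so there are no learned weights here.
    """
    dim = len(tokens[0])
    scale = math.sqrt(dim)
    output = []
    for query in tokens:
        # Every token is compared with every other token, regardless of
        # how far apart they are in the sequence (global receptive field).
        scores = [
            sum(q * k for q, k in zip(query, key)) / scale for key in tokens
        ]
        weights = softmax(scores)
        # The output is a weighted average of all token vectors.
        mixed = [
            sum(w * value[d] for w, value in zip(weights, tokens))
            for d in range(dim)
        ]
        output.append(mixed)
    return output


# Hypothetical toy input: three "patch" feature vectors; the first and last
# are identical even though they sit at opposite ends of the sequence.
patches = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
attended = self_attention(patches)
print(attended)
```

In this toy input the first and last patches receive the same output because attention weights depend on feature similarity rather than spatial distance, which is the property that the long-range dependency modelling described above relies on.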
Results
We evaluated the segmentation performance of GLTNet on the Kermany, DUKE, and UMN datasets. Comprehensive experiments on the Kermany dataset showed that our model achieved an overall Dice coefficient of 0.8395, IoU of 0.7657, sensitivity of 0.8631, and precision of 0.8202, markedly outperforming other state-of-the-art retinal fluid segmentation approaches. Results on the DUKE and UMN datasets suggested that the model generalizes satisfactorily.
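The four reported metrics follow standard definitions in terms of per-class true positives (TP), false positives (FP), and false negatives (FN). The sketch below computes them for one fluid class from a predicted and a reference label map. The label encoding (0 = background, 1 = IRF, 2 = SRF, 3 = PED), the function name, and the toy masks are assumptions for illustration. The abstract does not state how the authors averaged across classes or images, so this is not their evaluation script.

```python
from typing import Dict, Sequence

LabelMap = Sequence[Sequence[int]]


def segmentation_metrics(pred: LabelMap, truth: LabelMap, label: int) -> Dict[str, float]:
    """Dice, IoU, sensitivity and precision for one class label."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p == label and t == label:
                tp += 1          # pixel correctly assigned to the class
            elif p == label:
                fp += 1          # predicted as the class, but it is not
            elif t == label:
                fn += 1          # belongs to the class, but was missed

    def ratio(numerator: int, denominator: int) -> float:
        # Return 0.0 when the class is absent from both maps.
        return numerator / denominator if denominator else 0.0

    return {
        "dice": ratio(2 * tp, 2 * tp + fp + fn),
        "iou": ratio(tp, tp + fp + fn),
        "sensitivity": ratio(tp, tp + fn),
        "precision": ratio(tp, tp + fp),
    }


# Hypothetical 3x4 label maps (assumed encoding: 0 background, 1 IRF, 2 SRF).
reference = [
    [0, 1, 1, 0],
    [0, 1, 2, 2],
    [0, 0, 2, 2],
]
predicted = [
    [0, 1, 1, 0],
    [0, 0, 2, 2],
    [0, 1, 2, 0],
]

for class_label, class_name in [(1, "IRF"), (2, "SRF")]:
    print(class_name, segmentation_metrics(predicted, reference, class_label))
```

In practice the label maps would usually be NumPy arrays and the counts would be vectorized; plain lists are used here only to keep the sketch dependency-free.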
Conclusions
Compared with current cutting-edge methods, the developed GLTNet achieved a significant boost in retinal fluid segmentation performance and showed good generalization and robustness, giving it great potential for assisting ophthalmologists in diagnosing a variety of eye disorders and developing as-needed therapy regimens.
About the journal:
To encourage the development of formal computing methods, and their application in biomedical research and medical practice, by illustration of fundamental principles in biomedical informatics research; to stimulate basic research into application software design; to report the state of research of biomedical information processing projects; to report new computer methodologies applied in biomedical areas; the eventual distribution of demonstrable software to avoid duplication of effort; to provide a forum for discussion and improvement of existing software; to optimize contact between national organizations and regional user groups by promoting an international exchange of information on formal methods, standards and software in biomedicine.
Computer Methods and Programs in Biomedicine covers computing methodology and software systems derived from computing science for implementation in all aspects of biomedical research and medical practice. It is designed to serve: biochemists; biologists; geneticists; immunologists; neuroscientists; pharmacologists; toxicologists; clinicians; epidemiologists; psychiatrists; psychologists; cardiologists; chemists; (radio)physicists; computer scientists; programmers and systems analysts; biomedical, clinical, electrical and other engineers; teachers of medical informatics and users of educational software.