Classifying retinal diseases via pyramid vision graph convolutional network for optical coherence tomography images.

IF 3.2, Zone 2 (Medicine), Q2 BIOCHEMICAL RESEARCH METHODS

Biomedical optics express. Pub Date: 2025-05-13. eCollection Date: 2025-06-01. DOI: 10.1364/BOE.558731

Jin Qian, Lei Tao, Changhao Gong, Jun Xu, Yuemei Luo

Volume 16, issue 6, pages 2312-2326. Publication type: Journal Article. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12265600/pdf/

Citations: 0

Abstract


Classifying retinal diseases via pyramid vision graph convolutional network for optical coherence tomography images.

Recent work has focused heavily on using deep neural networks to classify retinal diseases in optical coherence tomography (OCT) images. However, traditional deep neural networks treat images as grid or sequential structures, which limits their flexibility in capturing irregular and complex objects and results in suboptimal performance in practical applications. To address this issue, we propose a novel visual neural network model with a pyramid structure, called the pyramid vision graph convolutional network (PVGCN). This model enhances the correlations between structures by segmenting images into multiple nodes and connecting the nearest nodes. Specifically, it consists of two core components: 1) a vision graph block and 2) a pyramid structure. The vision graph block, composed of a grapher block and a feed-forward network (FFN), uses graph convolution methods to divide the image into multiple regions, treating them as nodes and representing the image as graph data. The graph constructed from these nodes can capture relationships between nodes without positional restrictions, better representing the irregular structure of retinal tissue. The FFN module mitigates the over-smoothing that arises in the grapher stage, enabling more accurate classification. The pyramid structure decomposes OCT images into a series of sub-images at different scales and integrates features across those scales to obtain a comprehensive representation of the hierarchical structure of the retina. By integrating features at different scales, this structure can replace the extraction of higher-dimensional features in a large model, significantly reducing the number of parameters. We conducted extensive experiments on two different datasets. The experimental results show that the proposed PVGCN achieved accuracies of 0.9954 and 0.9787 on the two datasets, respectively, surpassing existing methods. Additionally, the model demonstrated recognition capabilities comparable to those of human experts in the experiments, effectively identifying retinal diseases in OCT images.
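The graph-and-pyramid idea in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the function names (`patchify`, `knn_edges`, `graph_conv`, `downsample`, `pyramid_features`), the patch size, the value of `k`, the max-relative aggregation rule, and the 2x2 average-pooling pyramid are all assumptions chosen for illustration, and every learned weight (including the FFN) is omitted.

```python
import math


def patchify(image, patch):
    """Split a 2-D image (list of rows) into non-overlapping patch x patch
    tiles; each tile is flattened into one node feature vector."""
    h, w = len(image), len(image[0])
    return [
        [image[r + i][c + j] for i in range(patch) for j in range(patch)]
        for r in range(0, h - patch + 1, patch)
        for c in range(0, w - patch + 1, patch)
    ]


def knn_edges(nodes, k):
    """For each node, list the indices of its k nearest nodes in feature
    space (Euclidean distance), wherever those patches sit in the image."""
    edges = []
    for i, a in enumerate(nodes):
        dists = sorted((math.dist(a, b), j) for j, b in enumerate(nodes) if j != i)
        edges.append([j for _, j in dists[:k]])
    return edges


def graph_conv(nodes, edges):
    """Assumed aggregation rule (max-relative): concatenate each node with the
    element-wise max of (neighbor - node); zeros if a node has no neighbors."""
    out = []
    for x, nbrs in zip(nodes, edges):
        rel = [max((nodes[j][d] - x[d] for j in nbrs), default=0.0)
               for d in range(len(x))]
        out.append(x + rel)
    return out


def downsample(image):
    """Halve the resolution with 2x2 average pooling."""
    h, w = len(image) // 2 * 2, len(image[0]) // 2 * 2
    return [
        [(image[r][c] + image[r][c + 1] + image[r + 1][c] + image[r + 1][c + 1]) / 4
         for c in range(0, w, 2)]
        for r in range(0, h, 2)
    ]


def pyramid_features(image, levels=3, patch=2, k=3):
    """Build a graph at each scale, aggregate, average over nodes, and
    concatenate the per-scale descriptors into one feature vector."""
    feats = []
    for _ in range(levels):
        nodes = patchify(image, patch)
        nodes = graph_conv(nodes, knn_edges(nodes, k))
        dim = len(nodes[0])
        feats.extend(sum(n[d] for n in nodes) / len(nodes) for d in range(dim))
        image = downsample(image)
    return feats


# Toy 8x8 "image"; a real OCT B-scan would be far larger.
image = [[float((r * 8 + c) % 5) for c in range(8)] for r in range(8)]
features = pyramid_features(image, levels=3, patch=2, k=3)
print(len(features))
```

Because neighbors are chosen by feature distance rather than grid position, two patches far apart in the image can still be connected, which is the property the abstract credits for handling irregular retinal structures. In a trained model the concatenated multi-scale vector would feed a classifier head.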


Source journal

Biomedical optics express BIOCHEMICAL RESEARCH METHODS-OPTICS

CiteScore: 6.80

Self-citation rate: 11.80%

Articles published: 633

Review time: 1 month

About the journal: The journal's scope encompasses fundamental research, technology development, biomedical studies and clinical applications. BOEx focuses on leading-edge topics in the field, including: tissue optics and spectroscopy; novel microscopies; optical coherence tomography; diffuse and fluorescence tomography; photoacoustic and multimodal imaging; molecular imaging and therapies; nanophotonic biosensing; optical biophysics/photobiology; microfluidic optical devices; and vision research.