Glaucoformer: Dual-domain Global Transformer Network for Generalized Glaucoma Stage Classification
Dipankar Das, Deepak Ranjan Nayak, Ram Bilas Pachori
IEEE Journal of Biomedical and Health Informatics (Q1, Computer Science, Information Systems; IF 6.7), published 2025-05-29
DOI: 10.1109/JBHI.2025.3574997
Citations: 0
Abstract
Classification of glaucoma stages remains challenging due to substantial inter-stage similarities, the presence of irrelevant features, and subtle variations in lesion size, shape, and color in fundus images. To date, only a few efforts have been made toward this task using traditional machine learning and deep learning models, specifically convolutional neural networks (CNNs). While conventional CNN models capture local contextual features within fixed receptive fields, they fail to exploit global contextual dependencies. Transformers, on the other hand, are capable of modeling global contextual information. However, they lack the ability to capture local contexts and focus solely on attention in the spatial domain, ignoring feature analysis in the frequency domain. To address these issues, we present a novel dual-domain global transformer network, Glaucoformer, to effectively classify glaucoma stages. Specifically, we propose a dual-domain global transformer layer (DGTL) consisting of dual-domain channel attention (DCA) and dual-domain spatial attention (DSA), with a Fourier domain feature analyzer (FDFA) as the core component, integrated with a backbone. This enables the model to exploit local and global contextual feature dependencies in both the spatial and frequency domains, thereby learning prominent and discriminative feature representations. A shared key-query scheme is introduced to learn complementary features while reducing the parameter count. In addition, the DGTL leverages deformable convolution to enable the model to handle complex lesion irregularities. We evaluate our method on a benchmark dataset, and the experimental results and extensive comparisons with existing CNN- and vision transformer-based approaches indicate its effectiveness for glaucoma stage classification. Moreover, results on an unseen dataset demonstrate the generalizability of the model.
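The dual-domain idea described above — analyzing feature maps in both the spatial and the Fourier (frequency) domain — can be sketched with a minimal NumPy example. The functions `fourier_domain_features` and `dual_domain_descriptor` below are hypothetical illustrations of frequency-domain feature extraction, not the paper's actual FDFA, DCA, or DSA modules.

```python
import numpy as np

def fourier_domain_features(x):
    """Illustrative frequency-domain analysis of a 2-D feature map:
    a log-magnitude spectrum, shifted so low frequencies are centered.
    (Hypothetical sketch; the paper's FDFA is defined in the full text.)"""
    spectrum = np.fft.fftshift(np.fft.fft2(x))
    return np.log1p(np.abs(spectrum))

def dual_domain_descriptor(x):
    """Combine simple spatial and frequency statistics of one feature map
    into a small descriptor, mimicking the idea of attending to both
    domains rather than the spatial domain alone."""
    freq = fourier_domain_features(x)
    return np.array([x.mean(), x.std(), freq.mean(), freq.std()])

# Toy usage on a random 32x32 "feature map".
rng = np.random.default_rng(0)
fmap = rng.standard_normal((32, 32))
desc = dual_domain_descriptor(fmap)
print(desc.shape)  # (4,)
```

In a real network these statistics would be replaced by learned attention weights over channel and spatial dimensions; the sketch only shows why the frequency view carries complementary information (e.g. periodic texture that is diffuse in the spatial domain concentrates into a few spectral peaks).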
About the journal:
IEEE Journal of Biomedical and Health Informatics publishes original papers presenting recent advances where information and communication technologies intersect with health, healthcare, life sciences, and biomedicine. Topics include acquisition, transmission, storage, retrieval, management, and analysis of biomedical and health information. The journal covers applications of information technologies in healthcare, patient monitoring, preventive care, early disease diagnosis, therapy discovery, and personalized treatment protocols. It explores electronic medical and health records, clinical information systems, decision support systems, medical and biological imaging informatics, wearable systems, body area/sensor networks, and more. Integration-related topics like interoperability, evidence-based medicine, and secure patient data are also addressed.