From Image to Sequence: Exploring Vision Transformers for Optical Coherence Tomography Classification
Authors: Amirali Arbab, Aref Habibi, Hossein Rabbani, Mahnoosh Tajmirriahi
Journal: Journal of Medical Signals & Sensors, vol. 15, p. 18 (eCollection; Journal Article)
Publication date: 2025-06-09
DOI: https://doi.org/10.4103/jmss.jmss_58_24
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12180780/pdf/
Journal metrics: impact factor 1.1; JCR Q4 (Engineering, Biomedical)
Citation count: 0
Abstract
Background: Optical coherence tomography (OCT) is a pivotal imaging technique for the early detection and management of critical retinal diseases, notably diabetic macular edema and age-related macular degeneration. These conditions are significant global health concerns, affecting millions and leading to vision loss if not diagnosed promptly. Current methods for OCT image classification encounter specific challenges, such as the inherent complexity of retinal structures and considerable variability across different OCT datasets.
Methods: This paper introduces a novel hybrid model that integrates the strengths of convolutional neural networks (CNNs) and the vision transformer (ViT) to overcome these obstacles. CNNs excel at extracting detailed, localized features, while the ViT is adept at recognizing long-range patterns; combining the two enables a more effective and comprehensive analysis of OCT images.
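The abstract does not spell out how the CNN and ViT stages are connected. A common pattern in hybrid models, shown here only as a minimal sketch and not as the authors' implementation, is to let convolutions shrink the image to a small feature map and then flatten that map into a sequence of tokens for the transformer. The input size, the number of convolutions, and the helper names `conv_output_size` and `feature_map_to_tokens` are assumptions made for illustration.

```python
from typing import List

# A feature map stored as nested lists: [height][width][channels].
FeatureMap = List[List[List[float]]]


def conv_output_size(size: int, kernel: int, stride: int, padding: int = 0) -> int:
    """Spatial size after one convolution (standard floor formula)."""
    return (size + 2 * padding - kernel) // stride + 1


def feature_map_to_tokens(fmap: FeatureMap) -> List[List[float]]:
    """Flatten an H x W x C feature map into H*W tokens of dimension C.

    Tokens are emitted in row-major order, so token i corresponds to
    spatial position (i // W, i % W) of the feature map.
    """
    tokens = []
    for row in fmap:
        for cell in row:
            tokens.append(list(cell))
    return tokens


# Hypothetical CNN stem: a 224 x 224 input passed through four 3x3
# convolutions with stride 2 and padding 1 (assumed values, not from the paper).
side = 224
for _ in range(4):
    side = conv_output_size(side, kernel=3, stride=2, padding=1)
sequence_length = side * side  # number of tokens the transformer would see

# Toy 2 x 3 feature map with 2 channels per position.
toy = [[[r * 10 + c, r * 10 + c + 0.5] for c in range(3)] for r in range(2)]
tokens = feature_map_to_tokens(toy)

print(side, sequence_length)  # spatial side and token count
print(len(tokens), tokens[3])  # 6 tokens; token 3 is position (1, 0)
```

In a design like this the convolutions supply the localized features and the transformer attends across all positions of the resulting sequence, which is where the long-range modelling would come from.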
Results: Our model achieves an accuracy of 99.80% on the OCT2017 dataset, but its standout feature is parameter efficiency: it requires only 6.9 million parameters, significantly fewer than larger, more complex models such as Xception and OpticNet-71.
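To make the notion of a parameter budget concrete, the sketch below counts the learnable parameters of a small hybrid network from standard layer formulas. Every dimension in it (channel widths, `d_model`, `d_ff`, depth) is a hypothetical choice, so the total is not expected to reproduce the 6.9 million reported above; the real configuration is in the authors' repository. Positional embeddings and any class token are left out for brevity.

```python
def conv2d_params(c_in: int, c_out: int, kernel: int, bias: bool = True) -> int:
    """Weights of a kernel x kernel convolution, plus one bias per output channel."""
    return kernel * kernel * c_in * c_out + (c_out if bias else 0)


def linear_params(n_in: int, n_out: int) -> int:
    """Weights and biases of a fully connected layer."""
    return n_in * n_out + n_out


def encoder_block_params(d_model: int, d_ff: int) -> int:
    """Parameters of one standard transformer encoder block."""
    # Query, key, value and output projections, each d_model x d_model with bias.
    attention = 4 * linear_params(d_model, d_model)
    # Two-layer feed-forward network: d_model -> d_ff -> d_model.
    mlp = linear_params(d_model, d_ff) + linear_params(d_ff, d_model)
    # Two layer norms, each with a scale and a shift vector.
    layer_norms = 2 * (2 * d_model)
    return attention + mlp + layer_norms


# Hypothetical configuration (assumed, not taken from the paper).
channels = [1, 32, 64, 128, 256]   # grayscale input, four 3x3 conv layers
d_model, d_ff, depth = 256, 1024, 6
num_classes = 4                    # OCT2017 labels: CNV, DME, drusen, normal

stem = sum(conv2d_params(a, b, 3) for a, b in zip(channels, channels[1:]))
encoder = depth * encoder_block_params(d_model, d_ff)
head = linear_params(d_model, num_classes)
total = stem + encoder + head

print(f"stem={stem:,} encoder={encoder:,} head={head:,} total={total:,}")
```

With these assumed sizes the transformer blocks dominate the count, which is why width and depth of the encoder are the first things to trim when a model must fit limited hardware.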
Conclusion: This efficiency underscores the model's suitability for clinical settings, where computational resources may be limited but high accuracy and rapid diagnosis are imperative.
Code Availability: The code for this study is available at https://github.com/Amir1831/ViT4OCT.
About the journal:
JMSS is an interdisciplinary journal that covers all aspects of biomedical engineering, including bioelectrics, bioinformatics, medical physics, and health technology assessment. Subject areas covered by the journal include:
- Bioelectric: bioinstruments; biosensors; modeling; biomedical signal processing; medical image analysis and processing; medical imaging devices; control of biological systems; neuromuscular systems; cognitive sciences; telemedicine; robotics; medical ultrasonography; bioelectromagnetics; electrophysiology; cell tracking
- Bioinformatics and medical informatics: analysis of biological data; data mining; stochastic modeling; computational genomics; artificial intelligence and fuzzy applications; medical software; bioalgorithms; electronic health
- Biophysics and medical physics: computed tomography; radiation therapy; laser therapy
- Education in biomedical engineering
- Health technology assessment
- Standards in biomedical engineering