Medtransnet：用于医学图像分类的高级门控变压器网络

IF 2.3 4区计算机科学 Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Machine Vision and Applications Pub Date : 2024-05-29 DOI:10.1007/s00138-024-01542-2

Nagur Shareef Shaik, Teja Krishna Cherukuri, N Veeranjaneulu, Jyostna Devi Bodapati

{"title":"Medtransnet：用于医学图像分类的高级门控变压器网络","authors":"Nagur Shareef Shaik, Teja Krishna Cherukuri, N Veeranjaneulu, Jyostna Devi Bodapati","doi":"10.1007/s00138-024-01542-2","DOIUrl":null,"url":null,"abstract":"Accurate medical image classification poses a significant challenge in designing expert computer-aided diagnosis systems. While deep learning approaches have shown remarkable advancements over traditional techniques, addressing inter-class similarity and intra-class dissimilarity across medical imaging modalities remains challenging. This work introduces the advanced gating transformer network (MedTransNet), a deep learning model tailored for precise medical image classification. MedTransNet utilizes channel and multi-gate attention mechanisms, coupled with residual interconnections, to learn category-specific attention representations from diverse medical imaging modalities. Additionally, the use of gradient centralization during training helps in preventing overfitting and improving generalization, which is especially important in medical imaging applications where the availability of labeled data is often limited. Evaluation on benchmark datasets, including APTOS-2019, Figshare, and SARS-CoV-2, demonstrates effectiveness of the proposed MedTransNet across tasks such as diabetic retinopathy severity grading, multi-class brain tumor classification, and COVID-19 detection. Experimental results showcase MedTransNet achieving 85.68% accuracy for retinopathy grading, 98.37% (\\(\\pm \\,0.44\\)) for tumor classification, and 99.60% for COVID-19 detection, surpassing recent deep learning models. MedTransNet holds promise for significantly improving medical image classification accuracy.\n","PeriodicalId":51116,"journal":{"name":"Machine Vision and Applications","volume":"55 1","pages":""},"PeriodicalIF":2.3000,"publicationDate":"2024-05-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Medtransnet: advanced gating transformer network for medical image classification\",\"authors\":\"Nagur Shareef Shaik, Teja Krishna Cherukuri, N Veeranjaneulu, Jyostna Devi Bodapati\",\"doi\":\"10.1007/s00138-024-01542-2\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Accurate medical image classification poses a significant challenge in designing expert computer-aided diagnosis systems. While deep learning approaches have shown remarkable advancements over traditional techniques, addressing inter-class similarity and intra-class dissimilarity across medical imaging modalities remains challenging. This work introduces the advanced gating transformer network (MedTransNet), a deep learning model tailored for precise medical image classification. MedTransNet utilizes channel and multi-gate attention mechanisms, coupled with residual interconnections, to learn category-specific attention representations from diverse medical imaging modalities. Additionally, the use of gradient centralization during training helps in preventing overfitting and improving generalization, which is especially important in medical imaging applications where the availability of labeled data is often limited. Evaluation on benchmark datasets, including APTOS-2019, Figshare, and SARS-CoV-2, demonstrates effectiveness of the proposed MedTransNet across tasks such as diabetic retinopathy severity grading, multi-class brain tumor classification, and COVID-19 detection. Experimental results showcase MedTransNet achieving 85.68% accuracy for retinopathy grading, 98.37% (\\\\(\\\\pm \\\\,0.44\\\\)) for tumor classification, and 99.60% for COVID-19 detection, surpassing recent deep learning models. MedTransNet holds promise for significantly improving medical image classification accuracy.\\n\",\"PeriodicalId\":51116,\"journal\":{\"name\":\"Machine Vision and Applications\",\"volume\":\"55 1\",\"pages\":\"\"},\"PeriodicalIF\":2.3000,\"publicationDate\":\"2024-05-29\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Machine Vision and Applications\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1007/s00138-024-01542-2\",\"RegionNum\":4,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Machine Vision and Applications","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s00138-024-01542-2","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 0

摘要

准确的医学图像分类是设计专家计算机辅助诊断系统的一大挑战。虽然深度学习方法与传统技术相比取得了显著进步，但解决医学影像模式中的类间相似性和类内不相似性问题仍具有挑战性。这项研究引入了高级门控变压器网络（MedTransNet），这是一种专为精确医学影像分类而定制的深度学习模型。MedTransNet 利用通道和多门注意机制以及残差互连，从不同的医学成像模式中学习特定类别的注意表征。此外，在训练过程中使用梯度集中化有助于防止过拟合和提高泛化能力，这在医疗成像应用中尤为重要，因为标注数据的可用性往往有限。在 APTOS-2019、Figshare 和 SARS-CoV-2 等基准数据集上进行的评估证明了所提出的 MedTransNet 在糖尿病视网膜病变严重程度分级、多类脑肿瘤分类和 COVID-19 检测等任务中的有效性。实验结果表明，MedTransNet 的视网膜病变分级准确率达到 85.68%，肿瘤分类准确率达到 98.37%，COVID-19 检测准确率达到 99.60%，超过了最近的深度学习模型。MedTransNet有望显著提高医学图像分类的准确性。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

Medtransnet: advanced gating transformer network for medical image classification

查看原文本刊更多论文

Medtransnet: advanced gating transformer network for medical image classification

Accurate medical image classification poses a significant challenge in designing expert computer-aided diagnosis systems. While deep learning approaches have shown remarkable advancements over traditional techniques, addressing inter-class similarity and intra-class dissimilarity across medical imaging modalities remains challenging. This work introduces the advanced gating transformer network (MedTransNet), a deep learning model tailored for precise medical image classification. MedTransNet utilizes channel and multi-gate attention mechanisms, coupled with residual interconnections, to learn category-specific attention representations from diverse medical imaging modalities. Additionally, the use of gradient centralization during training helps in preventing overfitting and improving generalization, which is especially important in medical imaging applications where the availability of labeled data is often limited. Evaluation on benchmark datasets, including APTOS-2019, Figshare, and SARS-CoV-2, demonstrates effectiveness of the proposed MedTransNet across tasks such as diabetic retinopathy severity grading, multi-class brain tumor classification, and COVID-19 detection. Experimental results showcase MedTransNet achieving 85.68% accuracy for retinopathy grading, 98.37% (\(\pm \,0.44\)) for tumor classification, and 99.60% for COVID-19 detection, surpassing recent deep learning models. MedTransNet holds promise for significantly improving medical image classification accuracy.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Machine Vision and Applications 工程技术-工程：电子与电气

CiteScore

6.30

自引率

3.00%

发文量

审稿时长

8.7 months

期刊介绍： Machine Vision and Applications publishes high-quality technical contributions in machine vision research and development. Specifically, the editors encourage submittals in all applications and engineering aspects of image-related computing. In particular, original contributions dealing with scientific, commercial, industrial, military, and biomedical applications of machine vision, are all within the scope of the journal. Particular emphasis is placed on engineering and technology aspects of image processing and computer vision. The following aspects of machine vision applications are of interest: algorithms, architectures, VLSI implementations, AI techniques and expert systems for machine vision, front-end sensing, multidimensional and multisensor machine vision, real-time techniques, image databases, virtual reality and visualization. Papers must include a significant experimental validation component.