BrainDx: a dual-transformer framework using PVT and SegFormer for tumor diagnosis
Authors: Arshleen Kaur, Vinay Kukreja, Modafar Ati, Ankit Bansal, Shanmugasundaram Hariharan
Journal: Biomedical Signal Processing and Control, volume 113, Article 108917 (Impact Factor 4.9; JCR Q1, Engineering, Biomedical)
Publication date: 2025-10-14
DOI: 10.1016/j.bspc.2025.108917
URL: https://www.sciencedirect.com/science/article/pii/S1746809425014284
Citations: 0
Abstract
Context
Brain tumor diagnosis is challenging because tumors have complex morphology, indistinct boundaries, and subtle variations in appearance on Magnetic Resonance Imaging (MRI) scans. Manual diagnosis is time-consuming and error-prone, which makes automated systems crucial. Recent advances in deep learning, particularly in transformer models, have improved the accuracy and speed of medical image analysis.
Objective
This research aims to develop an Artificial Intelligence (AI)-based framework that integrates the Pyramid Vision Transformer (PVT) for tumor classification and SegFormer for tumor segmentation, thereby improving diagnostic accuracy and speed and reducing human error in brain tumor detection.
Methodology
The proposed framework, BrainDx, uses PVT to classify MRI images into four classes (Gliomas, Meningiomas, Pituitary Tumors, and Healthy Brain) and SegFormer to segment tumor regions in real time. The dataset consists of annotated MRI images that undergo preprocessing (normalization, resizing, and augmentation). The models are trained and then evaluated on performance metrics including accuracy, Dice score, Intersection over Union (IoU), and segmentation time.
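For readers unfamiliar with the two overlap metrics named above, the following is a minimal sketch of how the Dice score and IoU are commonly computed for a pair of binary masks. It is not the authors' evaluation code: the function names (`overlap_counts`, `dice_score`, `iou_score`), the nested-list mask representation, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration.

```python
from typing import Sequence, Tuple

# Assumed representation: a 2D binary mask, 1 = tumor pixel, 0 = background.
Mask = Sequence[Sequence[int]]


def overlap_counts(pred: Mask, target: Mask) -> Tuple[int, int, int]:
    """Return (intersection, predicted-positive count, target-positive count)."""
    if len(pred) != len(target) or any(
        len(p_row) != len(t_row) for p_row, t_row in zip(pred, target)
    ):
        raise ValueError("masks must have the same shape")
    inter = pred_sum = target_sum = 0
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            p_bit, t_bit = int(bool(p)), int(bool(t))
            inter += p_bit & t_bit
            pred_sum += p_bit
            target_sum += t_bit
    return inter, pred_sum, target_sum


def dice_score(pred: Mask, target: Mask) -> float:
    """Dice = 2 * |P ∩ T| / (|P| + |T|)."""
    inter, pred_sum, target_sum = overlap_counts(pred, target)
    if pred_sum + target_sum == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * inter / (pred_sum + target_sum)


def iou_score(pred: Mask, target: Mask) -> float:
    """IoU = |P ∩ T| / |P ∪ T|."""
    inter, pred_sum, target_sum = overlap_counts(pred, target)
    union = pred_sum + target_sum - inter
    if union == 0:
        return 1.0  # assumed convention, as above
    return inter / union


# Toy example (not data from the paper).
pred = [[0, 1, 1],
        [0, 1, 0]]
target = [[0, 1, 0],
          [0, 1, 1]]
dice = dice_score(pred, target)  # 2 * 2 / (3 + 3)
iou = iou_score(pred, target)    # 2 / 4
print(f"Dice = {dice:.3f}, IoU = {iou:.3f}")
```

For a single pair of masks the two metrics are related by Dice = 2·IoU / (1 + IoU), so Dice is never lower than IoU for the same prediction.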
Results
The framework was evaluated across three benchmark MRI datasets, achieving a classification accuracy of 94.0% and a Dice score of 0.87 for tumor segmentation. SegFormer demonstrated real-time segmentation, processing MRI images in under 50 ms. Both models maintained high efficiency while delivering robust performance, even in cases of irregular tumor boundaries.
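A per-image segmentation time such as the one reported above is typically obtained by timing each inference call after a few warm-up runs. The sketch below shows one way to do this with the Python standard library. The helper `per_image_latency_ms` and the stand-in `dummy_segment` are hypothetical and do not reproduce the authors' measurement protocol, hardware, or model.

```python
import statistics
import time
from typing import Callable, Iterable, List


def per_image_latency_ms(
    segment: Callable[[object], object],
    images: Iterable[object],
    warmup: int = 2,
) -> List[float]:
    """Time one segmentation call per image and return latencies in milliseconds."""
    image_list = list(images)
    # Warm-up calls are excluded from timing so one-off start-up costs do not skew results.
    for image in image_list[:warmup]:
        segment(image)
    latencies = []
    for image in image_list:
        start = time.perf_counter()
        segment(image)
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def dummy_segment(image):
    """Hypothetical placeholder for a real model call: thresholds pixels at 0.5."""
    return [[1 if px > 0.5 else 0 for px in row] for row in image]


# Toy inputs (not MRI data): ten identical 2x2 "images".
images = [[[0.1, 0.9], [0.7, 0.2]] for _ in range(10)]
latencies = per_image_latency_ms(dummy_segment, images)
print(f"median: {statistics.median(latencies):.4f} ms, max: {max(latencies):.4f} ms")
```

When the model runs on a GPU, the device usually has to be synchronized before each clock reading, because kernel launches are asynchronous and wall-clock timing would otherwise under-report the latency.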
Future Scope
Future work will focus on further optimizing the model for real-time clinical use and on improving generalization across diverse tumor types and MRI modalities. This AI-powered system has the potential to enhance diagnostic processes and significantly improve patient outcomes.
About the journal:
Biomedical Signal Processing and Control aims to provide a cross-disciplinary international forum for the interchange of information on research in the measurement and analysis of signals and images in clinical medicine and the biological sciences. Emphasis is placed on contributions dealing with practical, applications-led research on the use of methods and devices in clinical diagnosis, patient monitoring and management.
Biomedical Signal Processing and Control reflects the main areas in which these methods are being used and developed at the interface of both engineering and clinical science. The scope of the journal is defined to include relevant review papers, technical notes, short communications and letters. Tutorial papers and special issues will also be published.