Multiscale segmentation net for segregating heterogeneous brain tumors: Gliomas on multimodal MR images

Image and Vision Computing (Q2, Computer Science, Artificial Intelligence) · Journal Article · Published 2024-07-18 · DOI: 10.1016/j.imavis.2024.105191
Citations: 0
Abstract
In this research, 3D volumetric segmentation of heterogeneous brain tumors such as gliomas (anaplastic astrocytoma and glioblastoma multiforme, GBM) is performed to extract the enhancing tumor (ET), whole tumor (WT), and tumor core (TC) regions from T1, T2, and FLAIR images. To this end, a deep learning-based encoder-decoder architecture named “MS-SegNet”, built on 3D multi-scale convolutional layers, is proposed. The architecture employs a multi-scale feature extraction (MS-FE) block in which a filter of size 3 × 3 × 3 extracts localized information such as the tumor boundary and the edges of the necrotic part, while a filter of size 5 × 5 × 5 captures broader characteristics such as the shape, size, and location of the tumor region together with edema. Local and global features from the different MR modalities are thereby extracted to segment the thin, meshed boundaries between anatomical sub-regions of heterogeneous tumors, such as peritumoral edema, enhancing tumor, and necrotic tumor core. With the MS-FE block, the number of learnable parameters is reduced to 10 million, far fewer than in architectures such as 3D U-Net, which uses 27 million parameters, so the model consumes less computational power. A customized loss function combining dice loss and focal loss is also proposed to address the class imbalance problem, alongside metrics such as accuracy and Intersection over Union (IoU), i.e., the overlap between the ground-truth mask and the prediction. To evaluate the efficacy of the proposed method, four metrics are employed to analyze the model's overall performance: Dice Coefficient (DSC), Sensitivity, Specificity, and Hausdorff95 distance (H95). The proposed MS-SegNet architecture achieved DSCs of 0.81, 0.91, and 0.83 on BraTS 2020, and 0.86, 0.92, and 0.84 on BraTS 2021, for ET, WT, and TC respectively.

The developed model was also tested on a real-time dataset collected from the Post Graduate Institute of Medical Education & Research (PGIMER), Chandigarh, where it achieved DSCs of 0.79, 0.76, and 0.68 for ET, WT, and TC respectively. These findings show that deep learning models with enhanced feature extraction capabilities can be readily trained to attain high accuracy in segmenting heterogeneous brain tumors and hold promising results. In future work, other tumor datasets will be explored for brain tumor detection and treatment planning, to check the model's effectiveness in real-world healthcare environments.
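The abstract states that the custom loss combines dice loss and focal loss to counter class imbalance, but the exact formulation and weighting are not given here. The following is a minimal pure-Python sketch of such a combined loss for a flattened binary mask; the weights `w_dice` and `w_focal` and the focal parameters `alpha` and `gamma` are illustrative assumptions, not values from the paper.

```python
import math

def dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss: 1 - 2|A ∩ B| / (|A| + |B|) over flattened masks."""
    inter = sum(p * t for p, t in zip(pred, target))
    denom = sum(pred) + sum(target)
    return 1.0 - (2.0 * inter + eps) / (denom + eps)

def focal_loss(pred, target, alpha=0.25, gamma=2.0, eps=1e-6):
    """Binary focal loss: down-weights easy voxels so rare tumor classes dominate."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)    # clamp for numerical stability
        pt = p if t == 1 else 1.0 - p      # probability assigned to the true class
        total += -alpha * (1.0 - pt) ** gamma * math.log(pt)
    return total / len(pred)

def combined_loss(pred, target, w_dice=0.5, w_focal=0.5):
    """Weighted sum of Dice and focal terms (weights are illustrative)."""
    return w_dice * dice_loss(pred, target) + w_focal * focal_loss(pred, target)
```

In a real segmentation pipeline the same arithmetic would run on tensors of voxel probabilities; the Dice term rewards region overlap directly, while the focal term keeps the dominant background class from swamping the small ET and TC regions.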
Journal introduction:
Image and Vision Computing has as a primary aim the provision of an effective medium of interchange for the results of high quality theoretical and applied research fundamental to all aspects of image interpretation and computer vision. The journal publishes work that proposes new image interpretation and computer vision methodology or addresses the application of such methods to real world scenes. It seeks to strengthen a deeper understanding in the discipline by encouraging the quantitative comparison and performance evaluation of the proposed methodology. The coverage includes: image interpretation, scene modelling, object recognition and tracking, shape analysis, monitoring and surveillance, active vision and robotic systems, SLAM, biologically-inspired computer vision, motion analysis, stereo vision, document image understanding, character and handwritten text recognition, face and gesture recognition, biometrics, vision-based human-computer interaction, human activity and behavior understanding, data fusion from multiple sensor inputs, image databases.