Nayef Alqahtani, Arfat Ahmad Khan, Rakesh Kumar Mahendran, Muhammad Faheem
{"title":"VSMI 2 ${\\text{VSMI}}^{\\mathbf{2}}$ -PANet:基于多模态成像技术的多功能尺度可伸缩图像集成和补丁智能关注网络","authors":"Nayef Alqahtani, Arfat Ahmad Khan, Rakesh Kumar Mahendran, Muhammad Faheem","doi":"10.1049/cit2.70039","DOIUrl":null,"url":null,"abstract":"<p>Lung cancer (LC) is a major cancer which accounts for higher mortality rates worldwide. Doctors utilise many imaging modalities for identifying lung tumours and their severity in earlier stages. Nowadays, machine learning (ML) and deep learning (DL) methodologies are utilised for the robust detection and prediction of lung tumours. Recently, multi modal imaging emerged as a robust technique for lung tumour detection by combining various imaging features. To cope with that, we propose a novel multi modal imaging technique named versatile scale malleable image integration and patch wise attention network (<span></span><math>\n <semantics>\n <mrow>\n <msup>\n <mtext>VSMI</mtext>\n <mn>2</mn>\n </msup>\n <mo>−</mo>\n <mtext>PANet</mtext>\n </mrow>\n <annotation> ${\\text{VSMI}}^{2}-\\text{PANet}$</annotation>\n </semantics></math>) which adopts three imaging modalities named computed tomography (CT), magnetic resonance imaging (MRI) and single photon emission computed tomography (SPECT). The designed model accepts input from CT and MRI images and passes it to the <span></span><math>\n <semantics>\n <mrow>\n <msup>\n <mtext>VSMI</mtext>\n <mn>2</mn>\n </msup>\n </mrow>\n <annotation> ${\\text{VSMI}}^{2}$</annotation>\n </semantics></math> module that is composed of three sub-modules named image cropping module, scale malleable convolution layer (SMCL) and PANet module. CT and MRI images are subjected to image cropping module in a parallel manner to crop the meaningful image patches and provide them to the SMCL module. The SMCL module is composed of adaptive convolutional layers that investigate those patches in a parallel manner by preserving the spatial information. The output from the SMCL is then fused and provided to the PANet module. The PANet module examines the fused patches by analysing its height, width and channels of the image patch. As a result, it provides an output as high-resolution spatial attention maps indicating the location of suspicious tumours. The high-resolution spatial attention maps are then provided as an input to the backbone module which uses light wave transformer (LWT) for segmenting the lung tumours into three classes, such as normal, benign and malignant. In addition, the LWT also accepts SPECT image as input for capturing the variations precisely to segment the lung tumours. The performance of the proposed model is validated using several performance metrics, such as accuracy, precision, recall, <i>F</i>1-score and AUC curve, and the results show that the proposed work outperforms the existing approaches.</p>","PeriodicalId":46211,"journal":{"name":"CAAI Transactions on Intelligence Technology","volume":"10 5","pages":"1376-1393"},"PeriodicalIF":7.3000,"publicationDate":"2025-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.70039","citationCount":"0","resultStr":"{\"title\":\"VSMI\\n 2\\n \\n \\n ${\\\\text{VSMI}}^{\\\\mathbf{2}}$\\n -PANet: Versatile Scale-Malleable Image Integration and Patch Wise Attention Network With Transformer for Lung Tumour Segmentation Using Multi-Modal Imaging Techniques\",\"authors\":\"Nayef Alqahtani, Arfat Ahmad Khan, Rakesh Kumar Mahendran, Muhammad Faheem\",\"doi\":\"10.1049/cit2.70039\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>Lung cancer (LC) is a major cancer which accounts for higher mortality rates worldwide. Doctors utilise many imaging modalities for identifying lung tumours and their severity in earlier stages. Nowadays, machine learning (ML) and deep learning (DL) methodologies are utilised for the robust detection and prediction of lung tumours. Recently, multi modal imaging emerged as a robust technique for lung tumour detection by combining various imaging features. To cope with that, we propose a novel multi modal imaging technique named versatile scale malleable image integration and patch wise attention network (<span></span><math>\\n <semantics>\\n <mrow>\\n <msup>\\n <mtext>VSMI</mtext>\\n <mn>2</mn>\\n </msup>\\n <mo>−</mo>\\n <mtext>PANet</mtext>\\n </mrow>\\n <annotation> ${\\\\text{VSMI}}^{2}-\\\\text{PANet}$</annotation>\\n </semantics></math>) which adopts three imaging modalities named computed tomography (CT), magnetic resonance imaging (MRI) and single photon emission computed tomography (SPECT). The designed model accepts input from CT and MRI images and passes it to the <span></span><math>\\n <semantics>\\n <mrow>\\n <msup>\\n <mtext>VSMI</mtext>\\n <mn>2</mn>\\n </msup>\\n </mrow>\\n <annotation> ${\\\\text{VSMI}}^{2}$</annotation>\\n </semantics></math> module that is composed of three sub-modules named image cropping module, scale malleable convolution layer (SMCL) and PANet module. CT and MRI images are subjected to image cropping module in a parallel manner to crop the meaningful image patches and provide them to the SMCL module. The SMCL module is composed of adaptive convolutional layers that investigate those patches in a parallel manner by preserving the spatial information. The output from the SMCL is then fused and provided to the PANet module. The PANet module examines the fused patches by analysing its height, width and channels of the image patch. As a result, it provides an output as high-resolution spatial attention maps indicating the location of suspicious tumours. The high-resolution spatial attention maps are then provided as an input to the backbone module which uses light wave transformer (LWT) for segmenting the lung tumours into three classes, such as normal, benign and malignant. In addition, the LWT also accepts SPECT image as input for capturing the variations precisely to segment the lung tumours. The performance of the proposed model is validated using several performance metrics, such as accuracy, precision, recall, <i>F</i>1-score and AUC curve, and the results show that the proposed work outperforms the existing approaches.</p>\",\"PeriodicalId\":46211,\"journal\":{\"name\":\"CAAI Transactions on Intelligence Technology\",\"volume\":\"10 5\",\"pages\":\"1376-1393\"},\"PeriodicalIF\":7.3000,\"publicationDate\":\"2025-07-09\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.70039\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"CAAI Transactions on Intelligence Technology\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ietresearch.onlinelibrary.wiley.com/doi/10.1049/cit2.70039\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"CAAI Transactions on Intelligence Technology","FirstCategoryId":"94","ListUrlMain":"https://ietresearch.onlinelibrary.wiley.com/doi/10.1049/cit2.70039","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
VSMI
2
${\text{VSMI}}^{\mathbf{2}}$
-PANet: Versatile Scale-Malleable Image Integration and Patch Wise Attention Network With Transformer for Lung Tumour Segmentation Using Multi-Modal Imaging Techniques
Lung cancer (LC) is a major cancer which accounts for higher mortality rates worldwide. Doctors utilise many imaging modalities for identifying lung tumours and their severity in earlier stages. Nowadays, machine learning (ML) and deep learning (DL) methodologies are utilised for the robust detection and prediction of lung tumours. Recently, multi modal imaging emerged as a robust technique for lung tumour detection by combining various imaging features. To cope with that, we propose a novel multi modal imaging technique named versatile scale malleable image integration and patch wise attention network () which adopts three imaging modalities named computed tomography (CT), magnetic resonance imaging (MRI) and single photon emission computed tomography (SPECT). The designed model accepts input from CT and MRI images and passes it to the module that is composed of three sub-modules named image cropping module, scale malleable convolution layer (SMCL) and PANet module. CT and MRI images are subjected to image cropping module in a parallel manner to crop the meaningful image patches and provide them to the SMCL module. The SMCL module is composed of adaptive convolutional layers that investigate those patches in a parallel manner by preserving the spatial information. The output from the SMCL is then fused and provided to the PANet module. The PANet module examines the fused patches by analysing its height, width and channels of the image patch. As a result, it provides an output as high-resolution spatial attention maps indicating the location of suspicious tumours. The high-resolution spatial attention maps are then provided as an input to the backbone module which uses light wave transformer (LWT) for segmenting the lung tumours into three classes, such as normal, benign and malignant. In addition, the LWT also accepts SPECT image as input for capturing the variations precisely to segment the lung tumours. The performance of the proposed model is validated using several performance metrics, such as accuracy, precision, recall, F1-score and AUC curve, and the results show that the proposed work outperforms the existing approaches.
期刊介绍:
CAAI Transactions on Intelligence Technology is a leading venue for original research on the theoretical and experimental aspects of artificial intelligence technology. We are a fully open access journal co-published by the Institution of Engineering and Technology (IET) and the Chinese Association for Artificial Intelligence (CAAI) providing research which is openly accessible to read and share worldwide.