Aditya Pal, Hari Mohan Rai, Joon Yoo, Sang-Ryong Lee, Yooheon Park
{"title":"ViT-DCNN:用于肺癌和结肠癌检测的可变形CNN模型视觉变压器。","authors":"Aditya Pal, Hari Mohan Rai, Joon Yoo, Sang-Ryong Lee, Yooheon Park","doi":"10.3390/cancers17183005","DOIUrl":null,"url":null,"abstract":"<p><p><b>Background/Objectives:</b> Lung and colon cancers remain among the most prevalent and fatal diseases worldwide, and their early detection is a serious challenge. The data used in this study was obtained from the Lung and Colon Cancer Histopathological Images Dataset, which comprises five different classes of image data, namely colon adenocarcinoma, colon normal, lung adenocarcinoma, lung normal, and lung squamous cell carcinoma, split into training (80%), validation (10%), and test (10%) subsets. In this study, we propose the ViT-DCNN (Vision Transformer with Deformable CNN) model, with the aim of improving cancer detection and classification using medical images. <b>Methods:</b> The combination of the ViT's self-attention capabilities with deformable convolutions allows for improved feature extraction, while also enabling the model to learn both holistic contextual information as well as fine-grained localized spatial details. <b>Results:</b> On the test set, the model performed remarkably well, with an accuracy of 94.24%, an F1 score of 94.23%, recall of 94.24%, and precision of 94.37%, confirming its robustness in detecting cancerous tissues. Furthermore, our proposed ViT-DCNN model outperforms several state-of-the-art models, including ResNet-152, EfficientNet-B7, SwinTransformer, DenseNet-201, ConvNext, TransUNet, CNN-LSTM, MobileNetV3, and NASNet-A, across all major performance metrics. <b>Conclusions:</b> By using deep learning and advanced image analysis, this model enhances the efficiency of cancer detection, thus representing a valuable tool for radiologists and clinicians. This study demonstrates that the proposed ViT-DCNN model can reduce diagnostic inaccuracies and improve detection efficiency. 
Future work will focus on dataset enrichment and enhancing the model's interpretability to evaluate its clinical applicability. This paper demonstrates the promise of artificial-intelligence-driven diagnostic models in transforming lung and colon cancer detection and improving patient diagnosis.</p>","PeriodicalId":9681,"journal":{"name":"Cancers","volume":"17 18","pages":""},"PeriodicalIF":4.4000,"publicationDate":"2025-09-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12468260/pdf/","citationCount":"0","resultStr":"{\"title\":\"ViT-DCNN: Vision Transformer with Deformable CNN Model for Lung and Colon Cancer Detection.\",\"authors\":\"Aditya Pal, Hari Mohan Rai, Joon Yoo, Sang-Ryong Lee, Yooheon Park\",\"doi\":\"10.3390/cancers17183005\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p><b>Background/Objectives:</b> Lung and colon cancers remain among the most prevalent and fatal diseases worldwide, and their early detection is a serious challenge. The data used in this study was obtained from the Lung and Colon Cancer Histopathological Images Dataset, which comprises five different classes of image data, namely colon adenocarcinoma, colon normal, lung adenocarcinoma, lung normal, and lung squamous cell carcinoma, split into training (80%), validation (10%), and test (10%) subsets. In this study, we propose the ViT-DCNN (Vision Transformer with Deformable CNN) model, with the aim of improving cancer detection and classification using medical images. <b>Methods:</b> The combination of the ViT's self-attention capabilities with deformable convolutions allows for improved feature extraction, while also enabling the model to learn both holistic contextual information as well as fine-grained localized spatial details. 
<b>Results:</b> On the test set, the model performed remarkably well, with an accuracy of 94.24%, an F1 score of 94.23%, recall of 94.24%, and precision of 94.37%, confirming its robustness in detecting cancerous tissues. Furthermore, our proposed ViT-DCNN model outperforms several state-of-the-art models, including ResNet-152, EfficientNet-B7, SwinTransformer, DenseNet-201, ConvNext, TransUNet, CNN-LSTM, MobileNetV3, and NASNet-A, across all major performance metrics. <b>Conclusions:</b> By using deep learning and advanced image analysis, this model enhances the efficiency of cancer detection, thus representing a valuable tool for radiologists and clinicians. This study demonstrates that the proposed ViT-DCNN model can reduce diagnostic inaccuracies and improve detection efficiency. Future work will focus on dataset enrichment and enhancing the model's interpretability to evaluate its clinical applicability. This paper demonstrates the promise of artificial-intelligence-driven diagnostic models in transforming lung and colon cancer detection and improving patient diagnosis.</p>\",\"PeriodicalId\":9681,\"journal\":{\"name\":\"Cancers\",\"volume\":\"17 18\",\"pages\":\"\"},\"PeriodicalIF\":4.4000,\"publicationDate\":\"2025-09-15\",\"publicationTypes\":\"Journal 
Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12468260/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Cancers\",\"FirstCategoryId\":\"3\",\"ListUrlMain\":\"https://doi.org/10.3390/cancers17183005\",\"RegionNum\":2,\"RegionCategory\":\"医学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"ONCOLOGY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Cancers","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.3390/cancers17183005","RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ONCOLOGY","Score":null,"Total":0}
引用次数: 0
摘要
背景/目的:肺癌和结肠癌仍然是世界上最普遍和最致命的疾病之一,它们的早期发现是一项严峻的挑战。本研究中使用的数据来自肺癌和结肠癌组织病理学图像数据集,该数据集包括五类不同的图像数据,即结肠腺癌、结肠正常、肺腺癌、肺正常和肺鳞状细胞癌,分为训练(80%)、验证(10%)和测试(10%)亚集。在这项研究中,我们提出了viti - dcnn (Vision Transformer with Deformable CNN)模型,旨在提高医学图像对癌症的检测和分类。方法:将ViT的自关注能力与可变形卷积相结合,可以改进特征提取,同时也使模型能够学习整体上下文信息以及细粒度的局部空间细节。结果:在测试集上,该模型表现非常好,准确率为94.24%,F1得分为94.23%,召回率为94.24%,精度为94.37%,证实了其在检测癌组织方面的稳健性。此外,我们提出的ViT-DCNN模型在所有主要性能指标上都优于几种最先进的模型,包括ResNet-152、EfficientNet-B7、SwinTransformer、DenseNet-201、ConvNext、TransUNet、CNN-LSTM、MobileNetV3和NASNet-A。结论:该模型通过使用深度学习和先进的图像分析,提高了癌症检测的效率,是放射科医生和临床医生的一个有价值的工具。研究表明,所提出的ViT-DCNN模型可以降低诊断的不准确性,提高检测效率。未来的工作将集中在数据集的丰富和增强模型的可解释性,以评估其临床适用性。本文展示了人工智能驱动的诊断模型在改变肺癌和结肠癌检测和改善患者诊断方面的前景。
ViT-DCNN: Vision Transformer with Deformable CNN Model for Lung and Colon Cancer Detection.
Background/Objectives: Lung and colon cancers remain among the most prevalent and fatal diseases worldwide, and their early detection is a serious challenge. The data used in this study were obtained from the Lung and Colon Cancer Histopathological Images Dataset, which comprises five different classes of image data, namely colon adenocarcinoma, colon normal, lung adenocarcinoma, lung normal, and lung squamous cell carcinoma, split into training (80%), validation (10%), and test (10%) subsets. In this study, we propose the ViT-DCNN (Vision Transformer with Deformable CNN) model, with the aim of improving cancer detection and classification using medical images. Methods: Combining the ViT's self-attention capabilities with deformable convolutions allows for improved feature extraction, enabling the model to learn both holistic contextual information and fine-grained localized spatial details. Results: On the test set, the model performed remarkably well, with an accuracy of 94.24%, an F1 score of 94.23%, recall of 94.24%, and precision of 94.37%, confirming its robustness in detecting cancerous tissues. Furthermore, our proposed ViT-DCNN model outperforms several state-of-the-art models, including ResNet-152, EfficientNet-B7, SwinTransformer, DenseNet-201, ConvNext, TransUNet, CNN-LSTM, MobileNetV3, and NASNet-A, across all major performance metrics. Conclusions: By using deep learning and advanced image analysis, this model enhances the efficiency of cancer detection, thus representing a valuable tool for radiologists and clinicians. This study demonstrates that the proposed ViT-DCNN model can reduce diagnostic inaccuracies and improve detection efficiency. Future work will focus on dataset enrichment and enhancing the model's interpretability to evaluate its clinical applicability.
This paper demonstrates the promise of artificial-intelligence-driven diagnostic models in transforming lung and colon cancer detection and improving patient diagnosis.
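The abstract reports accuracy, precision, recall, and F1 that are nearly identical to one another, which is characteristic of support-weighted averaging over a multi-class problem such as the five histopathology classes described above. The paper does not state its averaging scheme, so the sketch below is an assumption; the class names are likewise hypothetical placeholders. It shows how weighted precision, recall, and F1 would be computed from test-set predictions, using only the standard library:

```python
from collections import Counter

# Hypothetical labels mirroring the five classes named in the abstract.
CLASSES = ["colon_aca", "colon_normal", "lung_aca", "lung_normal", "lung_scc"]

def weighted_metrics(y_true, y_pred):
    """Accuracy plus support-weighted precision, recall, and F1.

    Each per-class metric is weighted by that class's share of the
    true labels, so frequent classes contribute proportionally more.
    """
    n = len(y_true)
    support = Counter(y_true)  # true-label count per class
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n
    precision = recall = f1 = 0.0
    for c in support:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        predicted_c = sum(1 for p in y_pred if p == c)
        prec_c = tp / predicted_c if predicted_c else 0.0
        rec_c = tp / support[c]
        f1_c = (2 * prec_c * rec_c / (prec_c + rec_c)
                if (prec_c + rec_c) else 0.0)
        weight = support[c] / n
        precision += weight * prec_c
        recall += weight * rec_c
        f1 += weight * f1_c
    return accuracy, precision, recall, f1
```

One property of this scheme is that weighted recall is mathematically equal to accuracy (each class's true positives are summed and divided by the total sample count), which is consistent with the abstract reporting the same value, 94.24%, for both.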
About the journal:
Cancers (ISSN 2072-6694) is an international, peer-reviewed open access journal on oncology. It publishes reviews, regular research papers and short communications. Our aim is to encourage scientists to publish their experimental and theoretical results in as much detail as possible. There is no restriction on the length of the papers. The full experimental details must be provided so that the results can be reproduced.