Modified ResNet model for medical image-based lung cancer detection

IF 4.2 · Zone 3, Computer Science · JCR Q2, Computer Science, Artificial Intelligence

Image and Vision Computing · Pub Date: 2025-10-03 · DOI: 10.1016/j.imavis.2025.105752

Zeyad Q. Habeeb, Branislav Vuksanovic, Imad Q. Alzaydi

Volume 163, Article 105752 · Journal Article · https://www.sciencedirect.com/science/article/pii/S0262885625003403

Citations: 0

Abstract

Lung cancer remains the most common cause of cancer death worldwide, so there is strong demand for better diagnostic tools. This research proposes a modified ResNet-50 model, tuned for diagnosis, that detects and diagnoses lung cancer from chest X-ray images. The ResNet-50 architecture is adapted to the particular challenges of medical imaging data. The modifications are: extra batch normalization layers to stabilize training; global average pooling in place of the fully connected layers to reduce overfitting; and a squeeze-and-excitation (SE) block that strengthens the model's focus on key features such as nodules and lesions. The model is initialized from pre-trained ResNet-50 weights (transfer learning) and then fine-tuned on the lung image dataset to improve sensitivity to cancerous patterns. Evaluated on lung images from the publicly available JSRT dataset, the modified ResNet-50 outperforms both the original ResNet-50 and state-of-the-art methods. It achieves high sensitivity, specificity, precision, F1-score, and accuracy, the metrics considered most important in clinical settings, with accuracy reaching as high as 98.77% for lung cancer detection. The results also indicate that the modified ResNet model can be a highly reliable and efficient tool for the early detection of lung cancer. The improved architecture yields better diagnostic accuracy with reduced computational complexity, which makes it suitable for real-time medical imaging applications.
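The squeeze-and-excitation step named in the abstract can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation: the abstract does not say where the SE block sits in the network, what reduction ratio it uses, or how it is trained, so the function names, weight shapes, and toy values below are all assumptions chosen for illustration. The sketch only shows the general SE idea: average each channel to one number (squeeze), pass the result through two small dense layers ending in a sigmoid (excitation), and rescale each channel by its gate.

```python
import math


def global_average_pool(feature_map):
    """Squeeze: reduce a C x H x W feature map to one mean value per channel."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]


def dense(vec, weights, bias):
    """Fully connected layer; `weights` holds one row per output unit."""
    return [
        sum(w * x for w, x in zip(row, vec)) + b
        for row, b in zip(weights, bias)
    ]


def se_block(feature_map, w1, b1, w2, b2):
    """Generic SE block (illustrative only, not the paper's exact design).

    w1/b1 map C channels to a smaller hidden size (ReLU),
    w2/b2 map the hidden vector back to C gates (sigmoid).
    Returns the rescaled feature map and the per-channel gates.
    """
    squeezed = global_average_pool(feature_map)
    hidden = [max(0.0, h) for h in dense(squeezed, w1, b1)]
    gates = [1.0 / (1.0 + math.exp(-z)) for z in dense(hidden, w2, b2)]
    scaled = [
        [[gate * value for value in row] for row in channel]
        for channel, gate in zip(feature_map, gates)
    ]
    return scaled, gates


# Toy input: 2 channels of 2 x 2 activations, with hypothetical weights
# (2 channels -> 1 hidden unit -> 2 gates).
toy_map = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 0.0], [0.0, 8.0]]]
toy_scaled, toy_gates = se_block(
    toy_map, w1=[[0.5, -0.5]], b1=[0.0], w2=[[1.0], [-1.0]], b2=[0.0, 0.0]
)
print(toy_gates)  # one gate in (0, 1) per channel
```

In a real network the two dense layers are learned along with the rest of the model, so channels that carry useful evidence receive gates closer to 1 and less useful channels are suppressed. A deep learning framework would express the same computation with tensor operations rather than nested lists.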




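Source journal

Image and Vision Computing (Engineering & Technology: Electrical and Electronic Engineering)

CiteScore: 8.50

Self-citation rate: 8.50%

Articles published: 143

Review time: 7.8 months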

Journal description: Image and Vision Computing primarily aims to provide an effective medium of interchange for the results of high-quality theoretical and applied research fundamental to all aspects of image interpretation and computer vision. The journal publishes work that proposes new image interpretation and computer vision methodology or addresses the application of such methods to real-world scenes. It seeks to deepen understanding in the discipline by encouraging quantitative comparison and performance evaluation of the proposed methodology. The coverage includes: image interpretation, scene modelling, object recognition and tracking, shape analysis, monitoring and surveillance, active vision and robotic systems, SLAM, biologically-inspired computer vision, motion analysis, stereo vision, document image understanding, character and handwritten text recognition, face and gesture recognition, biometrics, vision-based human-computer interaction, human activity and behavior understanding, data fusion from multiple sensor inputs, image databases.