Xing Wang, Houde Wu, Longshuang Wang, Jingxu Chen, Yi Li, Xinliu He, Ting Chen, Minghui Wang, Li Guo
{"title":"Enhanced pulmonary nodule detection with U-Net, YOLOv8, and swin transformer.","authors":"Xing Wang, Houde Wu, Longshuang Wang, Jingxu Chen, Yi Li, Xinliu He, Ting Chen, Minghui Wang, Li Guo","doi":"10.1186/s12880-025-01784-0","DOIUrl":null,"url":null,"abstract":"<p><strong>Rationale and objectives: </strong>Lung cancer remains the leading cause of cancer-related mortality worldwide, emphasizing the critical need for early pulmonary nodule detection to improve patient outcomes. Current methods encounter challenges in detecting small nodules and exhibit high false positive rates, placing an additional diagnostic burden on radiologists. This study aimed to develop a two-stage deep learning model integrating U-Net, Yolov8s, and the Swin transformer to enhance pulmonary nodule detection in computer tomography (CT) images, particularly for small nodules, with the goal of improving detection accuracy and reducing false positives.</p><p><strong>Materials and methods: </strong>We utilized the LUNA16 dataset (888 CT scans) and an additional 308 CT scans from Tianjin Chest Hospital. Images were preprocessed for consistency. The proposed model first employs U-Net for precise lung segmentation, followed by Yolov8s augmented with the Swin transformer for nodule detection. The Shape-aware IoU (SIoU) loss function was implemented to improve bounding box predictions.</p><p><strong>Results: </strong>For the LUNA16 dataset, the model achieved a precision of 0.898, a recall of 0.851, and a mean average precision at 50% IoU (mAP50) of 0.879, outperforming state-of-the-art models. The Tianjin Chest Hospital dataset has a precision of 0.855, a recall of 0.872, and an mAP50 of 0.862.</p><p><strong>Conclusion: </strong>This study presents a two-stage deep learning model that leverages U-Net, Yolov8s, and the Swin transformer for enhanced pulmonary nodule detection in CT images. The model demonstrates high accuracy and a reduced false positive rate, suggesting its potential as a useful tool for early lung cancer diagnosis and treatment.</p>","PeriodicalId":9020,"journal":{"name":"BMC Medical Imaging","volume":"25 1","pages":"247"},"PeriodicalIF":2.9000,"publicationDate":"2025-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"BMC Medical Imaging","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1186/s12880-025-01784-0","RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING","Score":null,"Total":0}
引用次数: 0
Abstract
Rationale and objectives: Lung cancer remains the leading cause of cancer-related mortality worldwide, emphasizing the critical need for early pulmonary nodule detection to improve patient outcomes. Current methods encounter challenges in detecting small nodules and exhibit high false positive rates, placing an additional diagnostic burden on radiologists. This study aimed to develop a two-stage deep learning model integrating U-Net, Yolov8s, and the Swin transformer to enhance pulmonary nodule detection in computer tomography (CT) images, particularly for small nodules, with the goal of improving detection accuracy and reducing false positives.
Materials and methods: We utilized the LUNA16 dataset (888 CT scans) and an additional 308 CT scans from Tianjin Chest Hospital. Images were preprocessed for consistency. The proposed model first employs U-Net for precise lung segmentation, followed by Yolov8s augmented with the Swin transformer for nodule detection. The Shape-aware IoU (SIoU) loss function was implemented to improve bounding box predictions.
Results: For the LUNA16 dataset, the model achieved a precision of 0.898, a recall of 0.851, and a mean average precision at 50% IoU (mAP50) of 0.879, outperforming state-of-the-art models. The Tianjin Chest Hospital dataset has a precision of 0.855, a recall of 0.872, and an mAP50 of 0.862.
Conclusion: This study presents a two-stage deep learning model that leverages U-Net, Yolov8s, and the Swin transformer for enhanced pulmonary nodule detection in CT images. The model demonstrates high accuracy and a reduced false positive rate, suggesting its potential as a useful tool for early lung cancer diagnosis and treatment.
期刊介绍:
BMC Medical Imaging is an open access journal publishing original peer-reviewed research articles in the development, evaluation, and use of imaging techniques and image processing tools to diagnose and manage disease.