A benchmark of deep learning approaches to predict lung cancer risk using national lung screening trial cohort.

IF 3.9 · Zone 2 (Multidisciplinary Journals) · JCR Q1 (Multidisciplinary Sciences)

Scientific Reports · Pub Date: 2025-01-11 · DOI: 10.1038/s41598-024-84193-7

Yifan Jiang, Leyla Ebrahimpour, Philippe Després, Venkata Sk Manem

Scientific Reports, volume 15 (1), article 1736. Publication type: Journal Article. PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11724919/pdf/

Citations: 0

Abstract

Deep learning (DL) methods have demonstrated remarkable effectiveness in assisting with lung cancer risk prediction tasks using computed tomography (CT) scans. However, the lack of comprehensive comparison and validation of state-of-the-art (SOTA) models in practical settings limits their clinical application. This study aims to review and analyze current SOTA deep learning models for lung cancer risk prediction (malignant-benign classification). To evaluate the models' general performance, we selected 253 of 467 patients from a subset of the National Lung Screening Trial (NLST) who had CT scans without contrast, the most commonly used type, and divided them into training and test cohorts. The CT scans were preprocessed into 2D-image and 3D-volume formats according to their nodule annotations. We evaluated ten 3D and eleven 2D SOTA deep learning models, pretrained on large-scale general-purpose datasets (Kinetics and ImageNet) and radiological datasets (3DSeg-8, nnUnet and RadImageNet), for their lung cancer risk prediction performance. Our results showed that 3D-based deep learning models generally perform better than 2D models. On the test cohort, the best-performing 3D model achieved an AUROC of 0.86, while the best 2D model reached 0.79. The lowest AUROCs for the 3D and 2D models were 0.70 and 0.62, respectively. Furthermore, pretraining on large-scale radiological image datasets did not show the expected performance advantage over pretraining on general-purpose datasets. Both 2D and 3D deep learning models can handle lung cancer risk prediction tasks effectively, although 3D models generally outperform their 2D counterparts. Our findings highlight the importance of carefully selecting pretraining datasets and model architectures for lung cancer risk prediction. Overall, these results have important implications for the development and clinical integration of DL-based tools in lung cancer screening.
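The abstract compares models by AUROC on the test cohort. As a reminder of what that number measures, here is a minimal sketch in plain Python. It is not the authors' evaluation code. The function name `auroc` and the toy labels and scores are hypothetical, made up for illustration only. The sketch uses the pairwise (Mann-Whitney) definition: the fraction of malignant/benign pairs in which the malignant case gets the higher predicted score, with ties counted as half.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a malignant case outscores a benign one.

    labels: 1 = malignant, 0 = benign (assumed encoding for this sketch).
    scores: predicted malignancy scores; higher means more suspicious.
    A tie between a positive and a negative score counts as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative case")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical model outputs (made-up numbers, not data from the study).
toy_labels = [1, 0, 1, 0, 1, 0]
toy_scores = [0.91, 0.35, 0.62, 0.70, 0.80, 0.10]
toy_auroc = auroc(toy_labels, toy_scores)
```

Under this reading, the reported best 3D AUROC of 0.86 means that a randomly chosen malignant case would be ranked above a randomly chosen benign case about 86% of the time. The quadratic pairwise loop is chosen for clarity; a rank-based implementation would be preferable for large cohorts.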

Source journal

Scientific Reports Natural Science Disciplines-

CiteScore: 7.50

Self-citation rate: 4.30%

Articles published: 19567

Review time: 3.9 months

About the journal: We publish original research from all areas of the natural sciences, psychology, medicine and engineering. You can learn more about what we publish by browsing our specific scientific subject areas below or explore Scientific Reports by browsing all articles and collections. Scientific Reports has a 2-year impact factor of 4.380 (2021) and is the 6th most-cited journal in the world, with more than 540,000 citations in 2020 (Clarivate Analytics, 2021).

• Engineering: Engineering covers all aspects of engineering, technology, and applied science. It plays a crucial role in the development of technologies to address some of the world's biggest challenges, helping to save lives and improve the way we live.

• Physical sciences: Physical sciences are those academic disciplines that aim to uncover the underlying laws of nature, often written in the language of mathematics. It is a collective term for areas of study including astronomy, chemistry, materials science and physics.

• Earth and environmental sciences: Earth and environmental sciences cover all aspects of Earth and planetary science and broadly encompass solid Earth processes, surface and atmospheric dynamics, Earth system history, climate and climate change, marine and freshwater systems, and ecology. It also considers the interactions between humans and these systems.

• Biological sciences: Biological sciences encompass all the divisions of natural sciences examining various aspects of vital processes. The concept includes anatomy, physiology, cell biology, biochemistry and biophysics, and covers all organisms, from microorganisms to animals and plants.

• Health sciences: The health sciences study health, disease and healthcare. This field of study aims to develop knowledge, interventions and technology for use in healthcare to improve the treatment of patients.