Noa Antonissen, Kiran Vaidhya Venkadesh, Renate Dinnessen, Ernst Th Scholten, Zaigham Saghir, Mario Silva, Ugo Pastorino, Grigory Sidorenkov, Marjolein A Heuvelmans, Geertruida H de Bock, Firdaus A A Mohamed Hoesein, Pim A de Jong, Harry J M Groen, Rozemarijn Vliegenthart, Hester A Gietema, Mathias Prokop, Cornelia Schaefer-Prokop, Colin Jacobs
External Test of a Deep Learning Algorithm for Pulmonary Nodule Malignancy Risk Stratification Using European Screening Data

Radiology, vol. 316, no. 3, e250874. Published 2025-09-01. DOI: 10.1148/radiol.250874 (https://doi.org/10.1148/radiol.250874)
Publication type: Journal Article. Impact factor: 15.2. JCR: Q1 (Radiology, Nuclear Medicine & Medical Imaging).

Abstract

Background: Low-dose CT screening reduces lung cancer-related deaths but has high rates of false-positive findings. A deep learning (DL) algorithm could improve nodule risk stratification but requires robust external testing.

Purpose: To externally test a DL algorithm for nodule malignancy risk estimation using pooled data from three large European lung cancer screening trials.

Materials and Methods: In this retrospective study, a DL algorithm trained on National Lung Screening Trial data was externally tested using baseline CT scans from the Danish Lung Cancer Screening Trial, the Multicentric Italian Lung Detection trial, and the Dutch-Belgian Lung Cancer Screening Trial. Performance was assessed across the pooled cohort and two subsets: subset A, including indeterminate nodules (5-15 mm); and subset B, including cancers size-matched to benign nodules (1:2 ratio). Performance, including the area under the receiver operating characteristic curve (AUC), was compared with the Pan-Canadian Early Detection of Lung Cancer (PanCan) model.

Results: The pooled cohort included 4146 participants (median age, 58 years; 78% male participants; median smoking history, 38 pack-years) with 7614 benign and 180 malignant nodules. The DL algorithm achieved AUCs of 0.98, 0.96, and 0.94 for cancers diagnosed within 1 year, 2 years, and throughout screening, respectively, compared with 0.98, 0.94, and 0.93 (P = .19, .02, and .46, respectively) for the PanCan model. In subset A (129 malignant and 2086 benign nodules), DL significantly outperformed PanCan across the same cancer diagnosis timeframes (respective AUCs: 0.95, 0.94, and 0.90 vs 0.91, 0.88, and 0.86; all P < .05). At 100% sensitivity for cancers diagnosed within 1 year, DL classified 68.1% of benign cases as low risk versus 47.4% for the PanCan model, a 39.4% relative reduction in false-positive findings. In subset B (180 malignant and 360 benign nodules), the AUC of the DL algorithm versus the PanCan model was 0.79 versus 0.60 (P < .01), respectively.

Conclusion: The DL algorithm outperformed the PanCan model across multiple European screening datasets, demonstrating superior malignancy prediction while substantially reducing false-positive classifications for indeterminate nodules.

© RSNA, 2025. Supplemental material is available for this article.
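The 39.4% figure in the Results can be checked from the two low-risk percentages alone. The sketch below is not the authors' code; it assumes that a benign case not classified as low risk counts as a false positive, and the function names are hypothetical. The second helper shows one common way to obtain a "low-risk fraction at 100% sensitivity" from risk scores (threshold at the lowest-scoring cancer); the abstract does not describe the study's actual thresholding procedure, and the scores in the usage lines are toy values, not study data.

```python
from typing import Sequence


def false_positive_rate(low_risk_fraction: float) -> float:
    """Assumption: benign cases NOT classified as low risk are false positives."""
    return 1.0 - low_risk_fraction


def relative_fp_reduction(low_risk_new: float, low_risk_ref: float) -> float:
    """Relative reduction in false positives of a new model versus a reference model."""
    fp_new = false_positive_rate(low_risk_new)
    fp_ref = false_positive_rate(low_risk_ref)
    return (fp_ref - fp_new) / fp_ref


def low_risk_fraction_at_full_sensitivity(
    scores: Sequence[float], labels: Sequence[int]
) -> float:
    """Fraction of benign cases scored below every malignant case.

    Hypothetical procedure: the threshold is the lowest malignant score,
    so no cancer falls in the low-risk group (100% sensitivity).
    labels: 1 = malignant, 0 = benign.
    """
    malignant = [s for s, y in zip(scores, labels) if y == 1]
    benign = [s for s, y in zip(scores, labels) if y == 0]
    if not malignant or not benign:
        raise ValueError("need at least one malignant and one benign case")
    threshold = min(malignant)
    return sum(s < threshold for s in benign) / len(benign)


if __name__ == "__main__":
    # Percentages reported in the abstract for subset A (DL 68.1%, PanCan 47.4%).
    print(f"{relative_fp_reduction(0.681, 0.474):.1%}")  # prints 39.4%

    # Toy scores for illustration only.
    toy_scores = [0.9, 0.4, 0.5, 0.3, 0.2, 0.1]
    toy_labels = [1, 1, 0, 0, 0, 0]
    print(low_risk_fraction_at_full_sensitivity(toy_scores, toy_labels))  # prints 0.75
```

With the reported values, the false-positive fractions are 31.9% for DL and 52.6% for PanCan, and (52.6 - 31.9) / 52.6 is approximately 39.4%, consistent with the abstract.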