Auxiliary Diagnosis of Pulmonary Nodules' Benignancy and Malignancy Based on Machine Learning: A Retrospective Study.

IF 2.4 | Zone 3 (Medicine) | JCR Q2: HEALTH CARE SCIENCES & SERVICES
Journal of Multidisciplinary Healthcare | Pub Date: 2025-06-27 | eCollection Date: 2025-01-01 | DOI: 10.2147/JMDH.S518166
Wanling Wang, Bingqing Yang, Huan Wu, Hebin Che, Yue Tong, Bozun Zhang, Hongwu Liu, Yuanyuan Chen
Volume 18, pages 3735-3748. Full-text PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12212436/pdf/
Citations: 0

Auxiliary Diagnosis of Pulmonary Nodules' Benignancy and Malignancy Based on Machine Learning: A Retrospective Study.

Background: Lung cancer, one of the most lethal malignancies globally, often presents insidiously as pulmonary nodules. Its nonspecific clinical presentation and heterogeneous imaging characteristics hinder accurate differentiation between benign and malignant lesions, while biopsy's invasiveness and procedural constraints underscore the critical need for non-invasive early diagnostic approaches.

Methods: In this retrospective study, we analyzed outpatient and inpatient records from the First Medical Center of Chinese PLA General Hospital between 2011 and 2021, focusing on pulmonary nodules measuring 5-30 mm on CT scans without overt signs of malignancy. Pathological examination served as the reference standard. The dataset was split 70%/30%, and stratified five-fold cross-validation was applied to the training set. Comparative experiments evaluated SVM, RF, XGBoost, FNN, and Atten_FNN under this cross-validation scheme, assessing AUC, sensitivity, and specificity. The optimal model was interpreted with SHAP to identify the most influential predictive features.
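The splitting scheme described in the Methods can be sketched in plain Python. This is only an illustration of a stratified 70%/30% split followed by stratified five-fold assignment of the training portion; it is not the authors' code. The helper names `stratified_split` and `stratified_folds`, the fixed random seed, and the reading of the 70% portion as the training set are assumptions. The class counts are the ones reported in the abstract.

```python
import random
from collections import defaultdict


def stratified_split(labels, holdout_fraction, rng):
    """Return (train_idx, holdout_idx), holding out the same fraction of each class."""
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train_idx, holdout_idx = [], []
    for y in sorted(by_class):
        idx = by_class[y][:]
        rng.shuffle(idx)
        n_holdout = round(len(idx) * holdout_fraction)
        holdout_idx.extend(idx[:n_holdout])
        train_idx.extend(idx[n_holdout:])
    return sorted(train_idx), sorted(holdout_idx)


def stratified_folds(indices, labels, k, rng):
    """Deal each class's indices round-robin into k folds to preserve the class ratio."""
    by_class = defaultdict(list)
    for i in indices:
        by_class[labels[i]].append(i)
    folds = [[] for _ in range(k)]
    for y in sorted(by_class):
        idx = by_class[y][:]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds


# Class counts from the abstract: 0 = benign (1156), 1 = malignant (2199).
labels = [0] * 1156 + [1] * 2199
rng = random.Random(0)  # assumption: fixed seed, for reproducibility only

train_idx, holdout_idx = stratified_split(labels, 0.30, rng)
folds = stratified_folds(train_idx, labels, 5, rng)

for f, val_idx in enumerate(folds):
    malignant_share = sum(labels[i] for i in val_idx) / len(val_idx)
    print(f"fold {f}: n={len(val_idx)}, malignant share={malignant_share:.3f}")
```

In each round of cross-validation, one fold would serve as the validation set and the remaining four as the fitting set; stratification keeps the malignant share of every fold close to the cohort-wide value of about 0.655.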

Results: This study enrolled 3355 patients, including 1156 with benign and 2199 with malignant pulmonary nodules. The Atten_FNN model demonstrated superior performance in five-fold cross-validation, achieving an AUC of 0.82, accuracy of 0.75, sensitivity of 0.77, and F1 score of 0.80. SHAP analysis revealed key predictive factors: demographic variables (age, sex, BMI), CT-derived features (maximum nodule diameter, morphology, density, calcification, ground-glass opacity), and laboratory biomarkers (neuroendocrine markers, carcinoembryonic antigen).
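As a reading aid for the reported metrics, the short calculation below derives two reference numbers from figures in the abstract: the accuracy a trivial "always malignant" classifier would reach on this cohort, and the precision implied by the reported F1 score and sensitivity. It assumes that F1 and sensitivity refer to the malignant class and uses the rounded values as published, so the result is approximate and is not a figure reported by the authors. The function name `precision_from_f1` is hypothetical.

```python
def precision_from_f1(f1, recall):
    """Invert F1 = 2PR / (P + R) to recover precision P from F1 and recall R."""
    return f1 * recall / (2 * recall - f1)


# Figures taken from the abstract (rounded as published).
benign, malignant = 1156, 2199
sensitivity, f1 = 0.77, 0.80

# Accuracy of a classifier that labels every nodule malignant.
majority_baseline = malignant / (benign + malignant)

# Assumption: F1 and sensitivity are both computed for the malignant class.
implied_precision = precision_from_f1(f1, sensitivity)

print(f"majority-class accuracy baseline: {majority_baseline:.3f}")
print(f"precision implied by F1 and sensitivity: {implied_precision:.3f}")
```

Under these assumptions the baseline is about 0.655, which gives context for the reported accuracy of 0.75, and the implied precision is about 0.83.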

Conclusion: This study integrates electronic medical records and pathology data to predict pulmonary nodule malignancy using machine/deep learning models. SHAP-based interpretability analysis uncovered key clinical determinants. Acknowledging limitations in cross-center generalizability, we propose the development of a multimodal diagnostic system that combines CT imaging and radiomics, to be validated in multi-center prospective cohorts to facilitate clinical translation. This framework establishes a novel paradigm for early precision diagnosis of lung cancer.

Source journal
Journal of Multidisciplinary Healthcare (Nursing-General Nursing)
CiteScore: 4.60
Self-citation rate: 3.00%
Articles published: 287
Review time: 16 weeks
About the journal: The Journal of Multidisciplinary Healthcare (JMDH) aims to represent and publish research in healthcare areas delivered by practitioners of different disciplines. This includes studies and reviews conducted by multidisciplinary teams, as well as research that evaluates or reports the results or conduct of such teams or of healthcare processes in general. The journal covers a very wide range of areas and welcomes submissions from practitioners at all levels and from all over the world. Good healthcare is not bounded by person, place, or time, and the journal aims to reflect this. The JMDH is published as an open-access journal so that this wide range of practical, patient-relevant research is available to practitioners immediately upon publication.