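Artificial intelligence-augmented ultrasound diagnosis of follicular-patterned thyroid neoplasms: a multicenter retrospective study.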

IF 10 | Zone 1, Medicine | JCR Q1 | MEDICINE, GENERAL & INTERNAL
EClinicalMedicine | Pub Date: 2025-07-14 | eCollection Date: 2025-08-01 | DOI: 10.1016/j.eclinm.2025.103351
Hui Shen, Shufang Pei, Yue Huang, Suqing Wu, Chifa Zhang, Ting Liang, Dan Yang, Xiaoxiao Feng, Shuyi Liu, Yu Wang, Weihan Cao, Ying Cheng, Hongyan Chen, Qiujie Ni, Fei Wang, Jingjing You, Zhe Jin, Wenle He, Jie Sun, Dexing Yang, Lijuan Liu, Boling Cao, Xiao Zhang, Yingjia Li, Shuixing Zhang, Bin Zhang
Citations: 0

Abstract

Background: Conventional diagnostic tools, including ultrasound, fine-needle aspiration cytology, and intraoperative frozen section pathology, may fail to reliably distinguish between benign and malignant follicular-patterned thyroid neoplasms (FNs), leading to unnecessary or inadequate surgical interventions. We aimed to develop and validate a deep learning (DL) system for the preoperative diagnosis of FNs using routine ultrasound images, with the goal of improving diagnostic accuracy and reducing unnecessary procedures.

Methods: In this multicenter, retrospective study, we included 3817 patients (2877 [75.4%] female) with a definitive diagnosis of FNs from 11 centers across China. All patients underwent preoperative ultrasound examinations. The dataset comprised 9393 ultrasound images, including thyroid follicular adenoma (n = 1787, 4317 images), follicular carcinoma (n = 446, 1593 images), and follicular variant of papillary thyroid carcinoma (n = 1584, 3483 images) collected between 2012 and 2025. A state-of-the-art OverLoCK (Overview-first-Look-Closely-next ConvNet with Context-Mixing Dynamic Kernels) model was developed on a dataset comprising 2728 patients (6625 images) and validated on an internal cohort (n = 683, 1905 images) and an external cohort (n = 406, 863 images). Model performance was evaluated using the area under the curve (AUC), accuracy, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and F1 score. Model calibration was evaluated using calibration curves, while clinical usefulness was assessed through decision curve analysis (DCA).
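
The threshold-based metrics listed above all derive from a 2x2 confusion matrix, and the AUC can be read as the probability that a malignant case receives a higher score than a benign one. The following is a minimal sketch under those standard definitions, not the authors' evaluation code; the function names and the toy labels are hypothetical and are not data from the study.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    sensitivity: float
    specificity: float
    ppv: float
    npv: float
    f1: float


def _ratio(num, den):
    # Return NaN instead of raising when a denominator is empty.
    return num / den if den else float("nan")


def binary_metrics(y_true, y_pred):
    """Compute confusion-matrix metrics; 1 = malignant, 0 = benign."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return BinaryMetrics(
        accuracy=_ratio(tp + tn, tp + tn + fp + fn),
        sensitivity=_ratio(tp, tp + fn),
        specificity=_ratio(tn, tn + fp),
        ppv=_ratio(tp, tp + fp),
        npv=_ratio(tn, tn + fn),
        f1=_ratio(2 * tp, 2 * tp + fp + fn),
    )


def auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half. This pairwise form is O(n_pos * n_neg), which is
    fine for illustration but slow for large cohorts.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return _ratio(wins, len(pos) * len(neg))


# Toy example (hypothetical labels, not study data).
metrics = binary_metrics([1, 1, 1, 1, 0, 0], [1, 1, 1, 0, 1, 0])
toy_auc = auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

Confidence intervals such as those reported in the Findings are typically obtained by bootstrap resampling of patients or by an analytic method; the abstract does not say which was used.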

Findings: The OverLoCK model exhibited excellent performance in both the internal and external validation sets. In the internal validation cohort, the OverLoCK model achieved an AUC of 0.937 (95% confidence interval [CI]: 0.919-0.954), with accuracy of 90.9% (95% CI: 87.7-92.0), sensitivity of 93.9% (95% CI: 91.5-95.6), specificity of 84.8% (95% CI: 82.6-86.0), PPV of 92.7% (95% CI: 90.7-93.8), NPV of 87.2% (95% CI: 86.0-91.0), and F1 score of 0.911 (95% CI: 0.887-0.932). In the external validation cohort, the model yielded an AUC of 0.853 (95% CI: 0.832-0.876), accuracy of 82.8% (95% CI: 81.7-84.4), sensitivity of 84.5% (95% CI: 82.5-86.2), specificity of 81.1% (95% CI: 79.2-84.5), PPV of 80.4% (95% CI: 79.0-84.0), NPV of 85.1% (95% CI: 83.2-87.7), and F1 score of 0.839 (95% CI: 0.802-0.877). Calibration curves showed good agreement between the predicted and observed probabilities of malignancy. DCA confirmed that the model was clinically useful.
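
Decision curve analysis is usually summarized as net benefit across a range of threshold probabilities, compared against the "treat all" and "treat none" strategies. The following is a minimal sketch assuming the standard net-benefit definition (true positives per patient minus false positives per patient, weighted by the odds of the threshold). The abstract does not state which thresholds or software the authors used; the function names and toy data are hypothetical.

```python
def net_benefit(y_true, probs, threshold):
    """Net benefit of acting on predictions at or above `threshold`.

    NB = TP/N - (FP/N) * threshold / (1 - threshold)
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for t, p in zip(y_true, probs) if p >= threshold and t == 1)
    fp = sum(1 for t, p in zip(y_true, probs) if p >= threshold and t == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit when every patient is treated as positive."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Toy example (hypothetical values, not study data).
labels = [1, 1, 0, 0]
predicted = [0.9, 0.6, 0.7, 0.2]
model_nb = net_benefit(labels, predicted, 0.5)
treat_all_nb = net_benefit_treat_all(labels, 0.5)
# "Treat none" has a net benefit of 0 at every threshold by definition.
```

Under this definition, a model is considered useful at a given threshold when its net benefit exceeds both the treat-all value and zero.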

Interpretation: Our study demonstrates that a DL-based system can provide a noninvasive, accurate, and reliable tool for the preoperative diagnosis of FNs. By improving diagnostic precision, this approach has the potential to optimize clinical decision-making and reduce the burden of overtreatment in patients with FNs. Further prospective studies are warranted to validate these findings in real-world clinical settings.

Funding: This work was supported by the National Key Research and Development Program of China (2023YFF1204600), the National Natural Science Foundation of China (82227802 and 82302190), the Clinical Frontier Technology Program of the First Affiliated Hospital of Jinan University (No. JNU1AF-CFTP-2022-a01201), the Science and Technology Projects in Guangzhou (202201020022, 2023A03J1036, 2023A03J1038, 2025A04J7006), the Outstanding Young Talents of Guangdong Special Support Program (Health Commission of Guangdong Province) (0720240213), and the Science and Technology Youth Talent Nurturing Program of Jinan University (21623209).

Source journal
EClinicalMedicine (Medicine: Medicine (all))
CiteScore: 18.90
Self-citation rate: 1.30%
Articles published: 506
Review time: 22 days
About the journal: eClinicalMedicine is a gold open-access clinical journal designed to support frontline health professionals in addressing the complex and rapid health transitions affecting societies globally. The journal aims to assist practitioners in overcoming healthcare challenges across diverse communities, spanning diagnosis, treatment, prevention, and health promotion. Integrating disciplines from various specialties and life stages, it seeks to enhance health systems as fundamental institutions within societies. With a forward-thinking approach, eClinicalMedicine aims to redefine the future of healthcare.