Adil Oezsoy, James Alexander Brooks, Marko van Treeck, Yvonne Doerffel, Ulrike Morgera, Jens Berger, Marco Gustav, Oliver Lester Saldanha, Tom Luedde, Jakob Nikolas Kather, Tobias Paul Seraphin, Michael Kallenbach
Digestion, pp. 1-13. Published 2025-03-06. DOI: 10.1159/000545098 (https://doi.org/10.1159/000545098). Publication type: Journal Article. Impact factor: 3.6; JCR Q2, Gastroenterology & Hepatology.
Abstract
Weakly Supervised Deep Learning Can Analyze Focal Liver Lesions in Contrast-Enhanced Ultrasound.
Introduction: Assessing the malignancy of focal liver lesions (FLLs) is an important yet challenging aspect of routine patient care. Contrast-enhanced ultrasound (CEUS) has proved to be a highly reliable tool, but its accuracy depends strongly on the examiner's expertise. The emergence of artificial intelligence has opened the door to algorithms that could aid the diagnostic process. In this study, we evaluate the performance of a weakly supervised deep learning model in classifying FLLs as malignant or benign.
Methods: Our retrospective feasibility study was based on a cohort of patients from a tertiary care hospital in Germany who underwent routine CEUS examination to evaluate the malignancy of FLLs. We trained a weakly supervised, attention-based multiple instance learning algorithm in 5-fold cross-validation to distinguish malignant from benign liver tumors, using only case-level labels and no manual annotations. We aggregated the cross-validation cycle that performed best on average and tested this combined model on a held-out test set. We evaluated its performance using standard performance metrics and developed explainability methods to gain insight into the model's decisions.
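The attention-based multiple instance learning step can be illustrated with a minimal sketch. The abstract does not describe the architecture, so everything below is an assumption: the code uses the common attention-pooling formulation in which each instance (here, a feature vector for one CEUS image) receives a score w^T tanh(V h_k), the scores are normalised with a softmax over the bag, and a logistic classifier acts on the attention-weighted sum. The function name `attention_mil_forward` and all parameters are hypothetical, and no training is shown.

```python
import math
import random


def attention_mil_forward(bag, V, w, clf_w, clf_b):
    """Score one bag (e.g. all images of one examination) with attention pooling.

    bag   : list of instance feature vectors, each of length d
    V     : attention projection, a list of rows, each of length d
    w     : attention vector, one entry per row of V
    clf_w : classifier weights of length d
    clf_b : classifier bias
    Returns (probability of the positive class, attention weights).
    """
    # Unnormalised attention score per instance: w^T tanh(V h_k)
    scores = []
    for h in bag:
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wi * ui for wi, ui in zip(w, hidden)))

    # Softmax over the instances, shifted by the maximum for stability
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    attention = [e / total for e in exps]

    # Bag embedding: attention-weighted sum of the instance features
    d = len(bag[0])
    z = [sum(a * h[j] for a, h in zip(attention, bag)) for j in range(d)]

    # Logistic classifier on the bag embedding gives the case-level output
    logit = sum(c * zj for c, zj in zip(clf_w, z)) + clf_b
    prob = 1.0 / (1.0 + math.exp(-logit))
    return prob, attention


# Toy usage with random, untrained parameters (illustrative values only)
rng = random.Random(0)
d, hidden_dim, n_images = 4, 3, 5
toy_bag = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(n_images)]
toy_V = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(hidden_dim)]
toy_w = [rng.uniform(-1, 1) for _ in range(hidden_dim)]
toy_clf_w = [rng.uniform(-1, 1) for _ in range(d)]

toy_prob, toy_attention = attention_mil_forward(toy_bag, toy_V, toy_w, toy_clf_w, 0.0)
print("case-level probability:", round(toy_prob, 3))
print("attention per image:", [round(a, 3) for a in toy_attention])
```

Because only the case-level probability enters the loss, a model of this kind can be trained from case labels alone. The per-image attention weights are one quantity that explainability analyses of such models often inspect.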
Results: We enrolled 370 patients, contributing a total of 955,938 images that were extracted from CEUS videos or manually captured during the examination. Our combined model identified malignant lesions with a mean area under the receiver operating characteristic curve (AUROC) of 0.844 in the cross-validation experiment and an AUROC of 0.94 (95% CI: 0.89-0.99) on the held-out test set. On the held-out test set, the accuracy, sensitivity, specificity, and F1-score of the combined model for detecting malignant lesions were 80.0%, 81.8%, 84.6%, and 0.81, respectively. Our exploratory analysis using visual explainability methods suggested that the model prioritizes information that expert clinicians also consider highly relevant to this task.
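For readers less familiar with the reported metrics, the sketch below shows how accuracy, sensitivity, specificity, F1-score, and AUROC are conventionally computed from case-level labels and model scores. The function names, the 0.5 decision threshold, and the toy data are assumptions made for illustration. They are not the study's data or evaluation code, and the abstract does not state which threshold was used.

```python
def binary_metrics(y_true, y_score, threshold=0.5):
    """Threshold-based metrics; label 1 = malignant, 0 = benign.

    Assumes both classes are present and at least one case is
    predicted positive, so that no denominator is zero.
    """
    tp = fp = tn = fn = 0
    for y, s in zip(y_true, y_score):
        pred = 1 if s >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "f1": 2 * tp / (2 * tp + fp + fn),
    }


def roc_auc(y_true, y_score):
    """AUROC as the probability that a random positive case outranks a
    random negative case, counting ties as one half."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up toy scores for six cases (not taken from the study)
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]
print(binary_metrics(toy_labels, toy_scores))
print("AUROC:", roc_auc(toy_labels, toy_scores))
```

AUROC does not depend on a threshold, whereas the other four metrics change with the chosen operating point, which is why the two kinds of result are usually reported side by side.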
Conclusion: Weakly supervised deep learning can classify malignancy in CEUS examinations of FLLs and thus might one day be able to assist doctors' decision-making in routine clinical practice.
About the journal:
"Digestion" concentrates on clinical research reports: in addition to editorials and reviews, the journal features sections on Stomach/Esophagus, Bowel, Neuro-Gastroenterology, Liver/Bile, Pancreas, Metabolism/Nutrition and Gastrointestinal Oncology. Papers cover physiology in humans, metabolic studies and clinical work on the etiology, diagnosis, and therapy of human diseases. It is thus especially well suited to gastroenterologists employed in hospitals and outpatient units. Moreover, the journal's coverage of studies on the metabolism and effects of therapeutic drugs carries considerable value for clinicians and investigators beyond the immediate field of gastroenterology.