An Explainable Deep Learning Model for Focal Liver Lesion Diagnosis Using Multiparametric MRI.
IF 13.2
Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Zhehan Shen, Lingzhi Chen, Lilong Wang, Shunjie Dong, Fakai Wang, Yaning Pan, Jiahao Zhou, Yikun Wang, Xinxin Xu, Huanhuan Chong, Huimin Lin, Weixia Li, Ruokun Li, Haihong Ma, Jing Ma, Yixing Yu, Lianjun Du, Xiaosong Wang, Shaoting Zhang, Fuhua Yan
Radiology: Artificial Intelligence, e240531. Published 2025-09-10. DOI: 10.1148/ryai.240531 (https://doi.org/10.1148/ryai.240531)
Abstract
Purpose: To assess the effectiveness of an explainable deep learning (DL) model, developed using multiparametric MRI (mpMRI) features, in improving the diagnostic accuracy and efficiency of radiologists in classifying focal liver lesions (FLLs).
Materials and Methods: FLLs ≥ 1 cm in diameter at mpMRI were included in the study. nnU-Net and Liver Imaging Feature Transformer (LIFT) models were developed using retrospective data from one hospital (January 2018 to August 2023). nnU-Net was used for lesion segmentation and LIFT for FLL classification. External testing was performed on data from three hospitals (January 2018 to December 2023), and a prospective test set was obtained from January 2024 to April 2024. Model performance was compared with that of radiologists, and the impact of model assistance on junior and senior radiologist performance was assessed. Evaluation metrics included the Dice similarity coefficient (DSC) and accuracy.
Results: A total of 2131 individuals with FLLs (mean age, 56 years ± 12 [SD]; 1476 female) were included in the training, internal test, external test, and prospective test sets. Average DSC values for liver and tumor segmentation across the three test sets were 0.98 and 0.96, respectively. Average accuracies for feature and lesion classification across the three test sets were 93% and 97%, respectively. LIFT-assisted readings improved diagnostic accuracy (average 5.3% increase, P < .001), reduced reading time (average 34.5-second decrease, P < .001), and enhanced confidence (average 0.3-point increase, P < .001) of junior radiologists.
Conclusion: The proposed DL model accurately detected and classified FLLs, improving the diagnostic accuracy and efficiency of junior radiologists.
©RSNA, 2025.
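The two evaluation metrics named in the abstract, the Dice similarity coefficient (DSC) and accuracy, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names `dice_coefficient` and `accuracy`, the flat 0/1 mask representation, the empty-mask convention, and the toy inputs are all assumptions made for illustration. DSC is taken in its standard form, 2|A ∩ B| / (|A| + |B|), for a predicted mask A and a reference mask B.

```python
def dice_coefficient(pred, truth):
    """Dice similarity coefficient for two binary masks given as flat 0/1 sequences."""
    if len(pred) != len(truth):
        raise ValueError("masks must contain the same number of voxels")
    # Voxels labeled foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Sum of the foreground sizes of the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if total == 0:
        return 1.0  # assumed convention: two empty masks count as perfect agreement
    return 2.0 * intersection / total


def accuracy(predicted_labels, true_labels):
    """Fraction of cases whose predicted class matches the reference class."""
    if len(predicted_labels) != len(true_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("at least one case is required")
    correct = sum(1 for p, t in zip(predicted_labels, true_labels) if p == t)
    return correct / len(true_labels)


# Toy inputs only; these are hypothetical and not data from the study.
pred_mask = [0, 1, 1, 1, 0, 0]
true_mask = [0, 1, 1, 0, 0, 0]
print(dice_coefficient(pred_mask, true_mask))  # 2 * 2 / (3 + 2) = 0.8

pred_classes = ["class_a", "class_b", "class_c", "class_a"]
true_classes = ["class_a", "class_b", "class_a", "class_a"]
print(accuracy(pred_classes, true_classes))  # 3 of 4 correct = 0.75
```

In practice a 3D segmentation mask would be flattened before being passed to `dice_coefficient`, and per-case values would be averaged over a test set to obtain summary figures comparable to those reported above.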
"Just Accepted" papers have undergone full peer review and have been accepted for publication in Radiology: Artificial Intelligence. This article will undergo copyediting, layout, and proof review before it is published in its final version. Please note that during production of the final copyedited article, errors may be discovered that could affect the content.