Lisong Dai, Jiayu Lei, Fenglong Ma, Zheng Sun, Haiyan Du, Houwang Zhang, Jingxuan Jiang, Jianyong Wei, Dan Wang, Guang Tan, Xinyu Song, Jinyu Zhu, Qianqian Zhao, Songtao Ai, Ai Shang, Zhaohui Li, Ya Zhang, Yuehua Li
{"title":"Boosting Deep Learning for Interpretable Brain MRI Lesion Detection through the Integration of Radiology Report Information.","authors":"Lisong Dai, Jiayu Lei, Fenglong Ma, Zheng Sun, Haiyan Du, Houwang Zhang, Jingxuan Jiang, Jianyong Wei, Dan Wang, Guang Tan, Xinyu Song, Jinyu Zhu, Qianqian Zhao, Songtao Ai, Ai Shang, Zhaohui Li, Ya Zhang, Yuehua Li","doi":"10.1148/ryai.230520","DOIUrl":null,"url":null,"abstract":"<p><p>Purpose To guide the attention of a deep learning (DL) model toward MRI characteristics of brain lesions by incorporating radiology report-derived textual features to achieve interpretable lesion detection. Materials and Methods In this retrospective study, 35 282 brain MRI scans (January 2018 to June 2023) and corresponding radiology reports from center 1 were used for training, validation, and internal testing. A total of 2655 brain MRI scans (January 2022 to December 2022) from centers 2-5 were reserved for external testing. Textual features were extracted from radiology reports to guide a DL model (ReportGuidedNet) focusing on lesion characteristics. Another DL model (PlainNet) without textual features was developed for comparative analysis. Both models identified 15 conditions, including 14 diseases and normal brains. Performance of each model was assessed by calculating macro-averaged area under the receiver operating characteristic curve (ma-AUC) and micro-averaged AUC (mi-AUC). Attention maps, which visualized model attention, were assessed with a five-point Likert scale. Results ReportGuidedNet outperformed PlainNet for all diagnoses on both internal (ma-AUC, 0.93 [95% CI: 0.91, 0.95] vs 0.85 [95% CI: 0.81, 0.88]; mi-AUC, 0.93 [95% CI: 0.90, 0.95] vs 0.89 [95% CI: 0.83, 0.92]) and external (ma-AUC, 0.91 [95% CI: 0.88, 0.93] vs 0.75 [95% CI: 0.72, 0.79]; mi-AUC, 0.90 [95% CI: 0.87, 0.92] vs 0.76 [95% CI: 0.72, 0.80]) testing sets. The performance difference between internal and external testing sets was smaller for ReportGuidedNet than for PlainNet (Δma-AUC, 0.03 vs 0.10; Δmi-AUC, 0.02 vs 0.13). The Likert scale score of ReportGuidedNet was higher than that of PlainNet (mean ± SD: 2.50 ± 1.09 vs 1.32 ± 1.20; <i>P</i> < .001). Conclusion The integration of radiology report textual features improved the ability of the DL model to detect brain lesions, thereby enhancing interpretability and generalizability. <b>Keywords:</b> Deep Learning, Computer-aided Diagnosis, Knowledge-driven Model, Radiology Report, Brain MRI <i>Supplemental material is available for this article.</i> Published under a CC BY 4.0 license.</p>","PeriodicalId":29787,"journal":{"name":"Radiology-Artificial Intelligence","volume":" ","pages":"e230520"},"PeriodicalIF":8.1000,"publicationDate":"2024-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Radiology-Artificial Intelligence","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1148/ryai.230520","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0
Abstract
Purpose To guide the attention of a deep learning (DL) model toward MRI characteristics of brain lesions by incorporating radiology report-derived textual features to achieve interpretable lesion detection. Materials and Methods In this retrospective study, 35 282 brain MRI scans (January 2018 to June 2023) and corresponding radiology reports from center 1 were used for training, validation, and internal testing. A total of 2655 brain MRI scans (January 2022 to December 2022) from centers 2-5 were reserved for external testing. Textual features were extracted from radiology reports to guide a DL model (ReportGuidedNet) focusing on lesion characteristics. Another DL model (PlainNet) without textual features was developed for comparative analysis. Both models identified 15 conditions, including 14 diseases and normal brains. Performance of each model was assessed by calculating macro-averaged area under the receiver operating characteristic curve (ma-AUC) and micro-averaged AUC (mi-AUC). Attention maps, which visualized model attention, were assessed with a five-point Likert scale. Results ReportGuidedNet outperformed PlainNet for all diagnoses on both internal (ma-AUC, 0.93 [95% CI: 0.91, 0.95] vs 0.85 [95% CI: 0.81, 0.88]; mi-AUC, 0.93 [95% CI: 0.90, 0.95] vs 0.89 [95% CI: 0.83, 0.92]) and external (ma-AUC, 0.91 [95% CI: 0.88, 0.93] vs 0.75 [95% CI: 0.72, 0.79]; mi-AUC, 0.90 [95% CI: 0.87, 0.92] vs 0.76 [95% CI: 0.72, 0.80]) testing sets. The performance difference between internal and external testing sets was smaller for ReportGuidedNet than for PlainNet (Δma-AUC, 0.03 vs 0.10; Δmi-AUC, 0.02 vs 0.13). The Likert scale score of ReportGuidedNet was higher than that of PlainNet (mean ± SD: 2.50 ± 1.09 vs 1.32 ± 1.20; P < .001). Conclusion The integration of radiology report textual features improved the ability of the DL model to detect brain lesions, thereby enhancing interpretability and generalizability. Keywords: Deep Learning, Computer-aided Diagnosis, Knowledge-driven Model, Radiology Report, Brain MRI Supplemental material is available for this article. Published under a CC BY 4.0 license.
期刊介绍:
Radiology: Artificial Intelligence is a bi-monthly publication that focuses on the emerging applications of machine learning and artificial intelligence in the field of imaging across various disciplines. This journal is available online and accepts multiple manuscript types, including Original Research, Technical Developments, Data Resources, Review articles, Editorials, Letters to the Editor and Replies, Special Reports, and AI in Brief.