Generating radiology reports via auxiliary signal guidance and a memory-driven network
Youyuan Xue, Yun Tan, Ling Tan, Jiaohua Qin, Xuyu Xiang
Expert Systems with Applications, Volume 237, Article 121260. Published 2023-09-09.
DOI: 10.1016/j.eswa.2023.121260
https://www.sciencedirect.com/science/article/pii/S0957417423017621
Citations: 1
Abstract
Automatically generating medical image reports is a valuable task. For doctors, it can reduce the heavy burden of writing reports; for patients, it can shorten the wait for reports; and it can help avoid misdiagnoses and missed diagnoses caused by human factors. However, the task remains challenging because of visual and textual data bias and the complex relationships among the components of medical reports. To this end, we propose an auxiliary signal guidance and memory-driven (ASGMD) network for generating medical reports automatically. It comprises three modules: an Auxiliary Signal Guidance (ASG) module, a Text Sequential Attention Mechanism (TSAM) module, and a Memory Mechanism-Driven Decoding (MMDD) module. Given a medical image of a patient, radiologists usually focus on the abnormal area first, then browse the global information in the image and write a corresponding report. Following this workflow, the ASG module introduces auxiliary signals to enhance the features of abnormal areas in medical images, which alleviates visual data bias. We design a novel TSAM module that explores the consistency of the medical report context and enhances essential medical information in reports to reduce textual data bias. Finally, the MMDD module integrates visual and textual knowledge to decode dynamically and generate the final report. Experimental results show that the proposed method outperforms state-of-the-art models on various evaluation metrics on two public datasets, IU-Xray and MIMIC-CXR. To make our results reproducible, our code has been released at https://github.com/shangchengLu/ASGMDN.
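The abstract does not say how the auxiliary signals enter the network, so the following is only a toy sketch of the general idea of emphasizing abnormal regions, not the authors' ASG module. It assumes each image region has a feature vector and a scalar auxiliary score, and re-weights the features with a softmax over those scores. The names `softmax`, `enhance_regions`, `region_features`, and `abnormality_scores` are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def enhance_regions(region_features, abnormality_scores):
    """Re-weight region features by auxiliary scores (illustrative assumption).

    region_features: list of equal-length feature vectors, one per region.
    abnormality_scores: one scalar per region; higher means more suspicious.
    The weights are scaled by the number of regions so that a uniform
    signal leaves every feature vector unchanged.
    """
    weights = softmax(abnormality_scores)
    n_regions = len(region_features)
    return [
        [w * n_regions * x for x in feature]
        for w, feature in zip(weights, region_features)
    ]


if __name__ == "__main__":
    features = [[1.0, 2.0], [3.0, 4.0]]
    # The second region has the higher auxiliary score, so it is amplified
    # while the first region is attenuated.
    print(enhance_regions(features, [0.0, 2.0]))
```

With equal scores the features pass through unchanged, so in this sketch the auxiliary signal only shifts emphasis between regions; the released repository should be consulted for the actual mechanism.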
About the journal:
Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans various sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.