Label knowledge guided transformer for automatic radiology report generation

IF 4.8 | Zone 2 (Medicine) | Q1 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS

Computer Methods and Programs in Biomedicine | Pub Date: 2025-05-23 | DOI: 10.1016/j.cmpb.2025.108877

Rui Wang, Jianguo Liang

Volume 269, Article 108877 | Not open access | https://www.sciencedirect.com/science/article/pii/S0169260725002949

Citations: 0

Abstract

Background and Objective

Automatic radiology report generation is a key research area at the intersection of computer science and medicine; its goal is to have computers generate reports from radiology images. The field currently faces a significant data bias issue: in the reports, words describing diseases are overshadowed by words describing normal regions.

Methods

To address this, we propose the label knowledge guided transformer model for generating radiology reports. Specifically, our model incorporates a Multi Feature Extraction module and a Dual-branch Collaborative Attention module. The Multi Feature Extraction module leverages medical knowledge graphs and feature clustering algorithms to optimize label feature extraction at both the prediction stage and the encoding stage of label information, making it the first module specifically designed to reduce redundant label features. The Dual-branch Collaborative Attention module uses two parallel attention mechanisms to attend to visual features and label features simultaneously, and it prevents label features from being integrated directly into visual features, thereby balancing the model's attention between the two.
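
The dual-branch idea can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give the layer details, so the scaled dot-product attention, the fixed `gate` mixing weight, the function names, and the tensor shapes below are all assumptions chosen for illustration. The only property taken from the text is that visual and label features are attended to in two parallel branches and the label features are never merged into the visual features themselves.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(query, keys, values):
    """Scaled dot-product attention: query (T, d), keys and values (N, d)."""
    scores = query @ keys.T / np.sqrt(query.shape[-1])
    return softmax(scores, axis=-1) @ values

def dual_branch_attention(hidden, visual, label, gate=0.5):
    """Attend to visual and label features in two parallel branches.

    Each branch keeps its own keys and values; only the branch outputs
    are mixed. `gate` is a hypothetical fixed mixing weight (an
    assumption of this sketch, not something stated in the abstract).
    """
    visual_ctx = attention(hidden, visual, visual)  # visual branch
    label_ctx = attention(hidden, label, label)     # label branch
    return gate * visual_ctx + (1.0 - gate) * label_ctx

rng = np.random.default_rng(0)
hidden = rng.normal(size=(4, 8))   # 4 decoder positions, width 8 (assumed)
visual = rng.normal(size=(49, 8))  # e.g. 7x7 image patches (assumed)
label = rng.normal(size=(14, 8))   # e.g. 14 label embeddings (assumed)

out = dual_branch_attention(hidden, visual, label)
print(out.shape)  # (4, 8)
```

Setting `gate=1.0` reduces the sketch to visual-only attention and `gate=0.0` to label-only attention, which makes the balance between the two feature sources explicit; a trained model would more plausibly learn this balance than fix it.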

Results

We conduct experimental tests using the IU X-Ray and MIMIC-CXR datasets under six natural language generation evaluation metrics and analyze the results. Experimental results demonstrate that our model achieves state-of-the-art (SOTA) performance. Compared with the baseline models, the label knowledge guided transformer achieves an average improvement of 23.3% on the IU X-Ray dataset and 20.7% on the MIMIC-CXR dataset.
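
The abstract does not state how the average improvement is computed. One plausible reading, assumed here, is the mean of the per-metric relative gains over a baseline. The sketch below shows that arithmetic only; the metric names and scores are placeholders, not results from the paper.

```python
def mean_relative_improvement(baseline, model):
    """Average of per-metric relative gains over the baseline, in percent."""
    gains = [(model[m] - baseline[m]) / baseline[m] * 100.0 for m in baseline]
    return sum(gains) / len(gains)

# Placeholder scores for illustration only, NOT taken from the paper.
baseline = {"BLEU-1": 0.40, "BLEU-4": 0.10, "ROUGE-L": 0.30}
model = {"BLEU-1": 0.44, "BLEU-4": 0.12, "ROUGE-L": 0.33}

# Per-metric gains are about 10%, 20%, and 10%.
print(round(mean_relative_improvement(baseline, model), 1))  # 13.3
```

Because relative gains are averaged, a metric with a small baseline value (such as BLEU-4 here) can dominate the figure, which is worth keeping in mind when comparing averaged improvements across datasets.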

Conclusion

Our model captures abnormal features well, effectively mitigates the adverse effects of data bias, and shows significant potential to improve the quality and accuracy of automatically generated radiology reports.

Source journal

Computer Methods and Programs in Biomedicine (Engineering and Technology: Biomedical Engineering)

CiteScore: 12.30

Self-citation rate: 6.60%

Articles published: 601

Review time: 135 days

Journal description: To encourage the development of formal computing methods, and their application in biomedical research and medical practice, by illustration of fundamental principles in biomedical informatics research; to stimulate basic research into application software design; to report the state of research of biomedical information processing projects; to report new computer methodologies applied in biomedical areas; to support the eventual distribution of demonstrable software to avoid duplication of effort; to provide a forum for discussion and improvement of existing software; to optimize contact between national organizations and regional user groups by promoting an international exchange of information on formal methods, standards and software in biomedicine. Computer Methods and Programs in Biomedicine covers computing methodology and software systems derived from computing science for implementation in all aspects of biomedical research and medical practice. It is designed to serve: biochemists; biologists; geneticists; immunologists; neuroscientists; pharmacologists; toxicologists; clinicians; epidemiologists; psychiatrists; psychologists; cardiologists; chemists; (radio)physicists; computer scientists; programmers and systems analysts; biomedical, clinical, electrical and other engineers; teachers of medical informatics and users of educational software.