Label knowledge guided transformer for automatic radiology report generation
Authors: Rui Wang, Jianguo Liang
Journal: Computer Methods and Programs in Biomedicine, volume 269, Article 108877
Publication date: 2025-05-23
Publication type: Journal Article
DOI: 10.1016/j.cmpb.2025.108877
URL: https://www.sciencedirect.com/science/article/pii/S0169260725002949
Impact factor: 4.9
JCR: Q1 (Computer Science, Interdisciplinary Applications)
Region category: Medicine (region 2)
Open access: no
Citations: 0
Abstract
Background and Objective
Automatic radiology report generation is a key research area at the intersection of computer science and medicine; its goal is to have computers generate a report directly from radiology images. The field currently faces a significant data bias problem: in the reports, words describing normal regions overshadow words describing diseases.
Methods
To address this, we propose the label knowledge guided transformer, a model for generating radiology reports. The model incorporates a Multi Feature Extraction module and a Dual-branch Collaborative Attention module. The Multi Feature Extraction module uses medical knowledge graphs and feature clustering algorithms to optimize label feature extraction in both the prediction and the encoding of label information; it is the first module specifically designed to reduce redundant label features. The Dual-branch Collaborative Attention module runs two parallel attention mechanisms, one over visual features and one over label features, and avoids integrating label features directly into visual features, thereby balancing the model's attention between the two.
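The dual-branch idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`attend`, `dual_branch_attention`), the single-head scaled dot-product attention, and the fixed `gate` used to mix the two branches are all assumptions made for illustration. The only property taken from the abstract is that visual and label features are attended to in parallel and that label features are never written into the visual features.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(queries, keys, values):
    """Scaled dot-product attention: returns one context vector per query."""
    scale = math.sqrt(len(keys[0]))
    dim = len(values[0])
    outputs = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        outputs.append(
            [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
        )
    return outputs


def dual_branch_attention(text_states, visual_feats, label_feats, gate=0.5):
    """Attend to visual and label features in two separate branches.

    The visual features are never modified by the label features. The two
    context vectors are mixed only at the output, using a fixed gate
    (a hypothetical choice; a real model would likely learn the mixing).
    """
    visual_ctx = attend(text_states, visual_feats, visual_feats)
    label_ctx = attend(text_states, label_feats, label_feats)
    return [
        [gate * v + (1.0 - gate) * l for v, l in zip(v_row, l_row)]
        for v_row, l_row in zip(visual_ctx, label_ctx)
    ]


# Toy inputs: 2 decoder states, 3 visual patches, 2 label embeddings, width 4.
text_states = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
visual_feats = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
label_feats = [[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5]]

fused = dual_branch_attention(text_states, visual_feats, label_feats)
```

Setting `gate=1.0` reduces the output to the visual branch alone, which makes it easy to check that the label branch contributes only through the final mix and not through the visual features themselves.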
Results
We evaluate the model on the IU X-Ray and MIMIC-CXR datasets using six natural language generation evaluation metrics and analyze the results. The model achieves state-of-the-art (SOTA) performance: compared with the baseline models, the label knowledge guided transformer achieves an average improvement of 23.3% on the IU X-Ray dataset and 20.7% on the MIMIC-CXR dataset.
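The abstract does not state how the average improvement is computed. One common reading is the mean relative gain over the baseline across all metrics, which the sketch below computes. The function name, the metric names, and the scores are placeholders for illustration only; they are not results or definitions from the paper.

```python
def average_relative_improvement(baseline, model):
    """Mean of (model - baseline) / baseline over all metrics, in percent."""
    gains = [
        (model[name] - baseline[name]) / baseline[name] * 100.0
        for name in baseline
    ]
    return sum(gains) / len(gains)


# Placeholder scores for illustration only; NOT results from the paper.
baseline_scores = {"metric_a": 0.40, "metric_b": 0.20, "metric_c": 0.10}
model_scores = {"metric_a": 0.44, "metric_b": 0.25, "metric_c": 0.12}

improvement = average_relative_improvement(baseline_scores, model_scores)
print(f"{improvement:.1f}%")
```

Under this reading, a metric with a low baseline score weighs as much as one with a high baseline, so a few large relative gains on low-scoring metrics can dominate the average.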
Conclusion
Our model captures abnormal features well, effectively mitigates the adverse effects of data bias, and shows significant potential to improve the quality and accuracy of automatically generated radiology reports.
About the journal:
To encourage the development of formal computing methods, and their application in biomedical research and medical practice, by illustration of fundamental principles in biomedical informatics research; to stimulate basic research into application software design; to report the state of research of biomedical information processing projects; to report new computer methodologies applied in biomedical areas; to promote the eventual distribution of demonstrable software to avoid duplication of effort; to provide a forum for discussion and improvement of existing software; to optimize contact between national organizations and regional user groups by promoting an international exchange of information on formal methods, standards and software in biomedicine.
Computer Methods and Programs in Biomedicine covers computing methodology and software systems derived from computing science for implementation in all aspects of biomedical research and medical practice. It is designed to serve: biochemists; biologists; geneticists; immunologists; neuroscientists; pharmacologists; toxicologists; clinicians; epidemiologists; psychiatrists; psychologists; cardiologists; chemists; (radio)physicists; computer scientists; programmers and systems analysts; biomedical, clinical, electrical and other engineers; teachers of medical informatics and users of educational software.