SGRRG: Leveraging radiology scene graphs for improved and abnormality-aware radiology report generation

IF 4.9 | Zone 2, Medicine | Q1 ENGINEERING, BIOMEDICAL

Computerized Medical Imaging and Graphics. Pub Date: 2025-09-15. DOI: 10.1016/j.compmedimag.2025.102644

Jun Wang, Lixing Zhu, Abhir Bhalerao, Yulan He

Volume 125, Article 102644. Full text: https://www.sciencedirect.com/science/article/pii/S0895611125001533

Citations: 0

Abstract

Radiology report generation (RRG) methods often lack sufficient medical knowledge to produce clinically accurate reports. A scene graph provides comprehensive information for describing objects within an image. However, automatically generated radiology scene graphs (RSGs) may contain noisy annotations and highly overlapping regions, which makes it challenging to use an RSG to enhance RRG. To this end, we propose Scene Graph aided RRG (SGRRG), a framework that leverages an automatically generated RSG and copes with noisy supervision in the RSG through a transformer-based module, effectively distilling medical knowledge in an end-to-end manner. SGRRG is composed of a dedicated scene graph encoder responsible for translating the radiograph into an RSG, and a scene graph-aided decoder that takes advantage of both patch-level and region-level visual information and mitigates the noisy annotation problem in the RSG. The incorporation of both patch-level and region-level features, alongside the integration of the essential RSG construction modules, enhances our framework’s flexibility and robustness, enabling it to readily exploit prior advanced RRG techniques. A fine-grained, sentence-level attention method is designed to better distill the RSG information. Additionally, we introduce two proxy tasks to enhance the model’s ability to produce clinically accurate reports. Extensive experiments demonstrate that SGRRG outperforms previous state-of-the-art methods in report generation and can better capture abnormal findings. Code is available at https://github.com/Markin-Wang/SGRRG.
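As a rough illustration of what "taking advantage of both patch-level and region-level visual information" could look like, the sketch below lets decoder states attend over a single memory formed by concatenating the two feature sets. This is not the authors' implementation (see the linked repository for that): the function `cross_attend`, the feature shapes, the random features, and the omission of learned projections are all assumptions made only for this example.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attend(queries, memory):
    """Single-head scaled dot-product attention (hypothetical helper).

    Learned query/key/value projections are omitted to keep the sketch short.
    """
    d = queries.shape[-1]
    scores = queries @ memory.T / np.sqrt(d)
    weights = softmax(scores, axis=-1)
    return weights @ memory, weights


rng = np.random.default_rng(0)
d_model = 16

# All shapes below are illustrative assumptions, not values from the paper.
patch_feats = rng.normal(size=(49, d_model))   # e.g. a 7x7 grid of image patches
region_feats = rng.normal(size=(5, d_model))   # e.g. 5 scene-graph regions
token_states = rng.normal(size=(3, d_model))   # decoder states for 3 report tokens

# One shared memory: patch-level features followed by region-level features.
memory = np.concatenate([patch_feats, region_feats], axis=0)
context, weights = cross_attend(token_states, memory)

# Share of each token's attention that falls on region-level features.
region_mass = weights[:, patch_feats.shape[0]:].sum(axis=1)
```

Inspecting `region_mass` shows how much each token relies on region-level versus patch-level features; a real model would learn projections and would also need some mechanism for the noisy region annotations the abstract mentions, which this sketch does not attempt.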


Source journal

Computerized Medical Imaging and Graphics (Medicine: Nuclear Medicine)

CiteScore: 10.70

Self-citation rate: 3.50%

Publication volume: (not given)

Review time: 26 days

Journal description: The purpose of the journal Computerized Medical Imaging and Graphics is to act as a source for the exchange of research results concerning algorithmic advances, development, and application of digital imaging in disease detection, diagnosis, intervention, prevention, precision medicine, and population health. Included in the journal will be articles on novel computerized imaging or visualization techniques, including artificial intelligence and machine learning, augmented reality for surgical planning and guidance, big biomedical data visualization, computer-aided diagnosis, computerized-robotic surgery, image-guided therapy, imaging scanning and reconstruction, mobile and tele-imaging, radiomics, and imaging integration and modeling with other information relevant to digital health. The types of biomedical imaging include: magnetic resonance, computed tomography, ultrasound, nuclear medicine, X-ray, microwave, optical and multi-photon microscopy, video and sensory imaging, and the convergence of biomedical images with other non-imaging datasets.