DMR$^2$G: diffusion model for radiology report generation
Authors: Huan Ouyang, Zheng Chang, Binghao Tang, Si Li
Journal: Multimedia Tools and Applications (impact factor 3.0; JCR Q2, Computer Science, Information Systems)
Published: 2024-09-16
DOI: 10.1007/s11042-024-20206-x (https://doi.org/10.1007/s11042-024-20206-x)
Abstract
Radiology report generation aims to produce accurate pathological assessments from radiographic images. Prior methods largely rely on autoregressive models, whose sequential token-by-token generation leads to long inference times and to error accumulation across decoding steps. To improve the efficiency of report generation without compromising diagnostic accuracy, we present a novel radiology report generation approach based on diffusion models. By integrating a graph-guided image feature extractor informed by a radiology knowledge graph, our model identifies critical abnormalities within images. We also introduce an auxiliary lesion classification loss that uses pseudo labels as supervision to align image features with textual disease keyword representations. By adopting the accelerated sampling strategy available to diffusion models, our approach significantly reduces inference time. In evaluations on the IU-Xray and MIMIC-CXR benchmarks, our approach outperforms autoregressive models in inference speed while maintaining report quality, a significant advance in automating the radiology report generation task.
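The speed argument in the abstract can be made concrete with a rough sketch. An autoregressive decoder needs one network call per generated token, whereas a diffusion sampler needs one call per denoising step, and accelerated sampling visits only a subset of the training timesteps. The code below is an illustration under stated assumptions, not the authors' implementation: the function names `strided_timesteps` and `forward_passes`, the evenly spaced schedule, and all numbers (1000 training steps, 10 sampling steps, a 60-token report) are hypothetical and do not come from the paper.

```python
def strided_timesteps(num_train_steps: int, num_sample_steps: int) -> list[int]:
    """Pick an evenly spaced, descending subset of the training timesteps.

    Hypothetical helper: the paper's actual sampling schedule is not given
    in the abstract, so an even stride is assumed here.
    """
    if not 1 <= num_sample_steps <= num_train_steps:
        raise ValueError("num_sample_steps must be in [1, num_train_steps]")
    stride = num_train_steps / num_sample_steps
    # Descending order: sampling runs from the noisiest step towards step 0.
    return [int(i * stride) for i in reversed(range(num_sample_steps))]


def forward_passes(report_length: int, num_sample_steps: int) -> dict[str, int]:
    """Count network calls: one per token (autoregressive) vs. one per denoising step."""
    return {"autoregressive": report_length, "diffusion": num_sample_steps}


if __name__ == "__main__":
    # Illustrative numbers only.
    print(strided_timesteps(1000, 10))
    print(forward_passes(report_length=60, num_sample_steps=10))
```

The count of network calls is only a proxy for latency: an autoregressive decoder with cached attention states does less work per call than a diffusion model that denoises the whole sequence at once, so measured speedups depend on the architectures being compared.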
About the journal:
Multimedia Tools and Applications publishes original research articles on multimedia development and system support tools as well as case studies of multimedia applications. It also features experimental and survey articles. The journal is intended for academics, practitioners, scientists and engineers who are involved in multimedia system research, design and applications. All papers are peer reviewed.
Specific areas of interest include:
- Multimedia Tools:
- Multimedia Applications:
- Prototype multimedia systems and platforms