Label correlated contrastive learning for medical report generation

IF 4.9 · JCR Region 2 (Medicine) · Q1 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS
Xinyao Liu , Junchang Xin , Bingtian Dai , Qi Shen , Zhihong Huang , Zhiqiong Wang
DOI: 10.1016/j.cmpb.2024.108482
Journal: Computer Methods and Programs in Biomedicine, Volume 258, Article 108482
Publication date: 2024-11-14
Full text: https://www.sciencedirect.com/science/article/pii/S0169260724004759
Citations: 0

Abstract

Background and Objective:

Automatic generation of medical reports reduces both the burden on radiologists and the possibility of errors due to radiologists' inexperience. Models that combine an attention mechanism with contrastive learning can generate medical reports by capturing both general and specific semantics. However, existing contrastive learning methods ignore a specific property of medical data: a patient may suffer from multiple diseases at the same time. Without such fine-grained relationships, contrastive learning suffers from insufficient specificity.

Methods:

To address this problem, a label correlated contrastive learning method is proposed to encourage the model to generate higher-quality reports. First, a refined similarity matrix describing the contrastive relationships between reports is obtained by computing the similarities between the reports' multi-label classifications. Second, the image feature representations and the semantic embeddings from the decoder are projected into a hidden space. Third, label correlated contrastive learning is performed with the hidden image representations, the text embeddings, and the similarity matrix; through this contrastive learning, "hard" negative samples that share more labels with the target sample are assigned larger weights. Finally, label correlated contrastive learning is combined with the attention mechanism to generate reports.
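The weighting scheme described above can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: it assumes Jaccard similarity between multi-hot label vectors as the "refined similarity matrix" and an InfoNCE-style image-text loss in which each negative pair is up-weighted by its label similarity, so negatives sharing more labels contribute more to the denominator.

```python
import numpy as np

def label_similarity(labels):
    """Jaccard similarity between multi-hot label vectors.
    (Assumed form of the paper's refined similarity matrix.)"""
    inter = labels @ labels.T
    union = labels.sum(1, keepdims=True) + labels.sum(1) - inter
    return inter / np.maximum(union, 1e-8)

def label_correlated_contrastive_loss(img_proj, txt_proj, labels, tau=0.1):
    """Sketch of an InfoNCE-style loss where each negative pair is
    up-weighted by the label similarity between the two samples,
    so "hard" negatives sharing more labels get larger weights."""
    z_i = img_proj / np.linalg.norm(img_proj, axis=1, keepdims=True)
    z_t = txt_proj / np.linalg.norm(txt_proj, axis=1, keepdims=True)
    logits = z_i @ z_t.T / tau                  # (N, N) image-text logits
    sim = label_similarity(labels)              # label-derived weights
    n = logits.shape[0]
    mask = ~np.eye(n, dtype=bool)               # off-diagonal = negatives
    w = np.where(mask, 1.0 + sim, 0.0)          # heavier "hard" negatives
    shift = logits.max(axis=1, keepdims=True)   # numerical stability
    exp = np.exp(logits - shift)
    pos = np.exp(np.diag(logits) - shift[:, 0]) # matched pairs (diagonal)
    denom = pos + (w * exp).sum(axis=1)
    return float(-np.log(pos / denom).mean())
```

Because every negative is weighted by at least 1.0, this reduces to ordinary InfoNCE when no labels are shared; samples with overlapping labels simply push the denominator, and hence the loss, harder.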

Results:

Comprehensive experiments are conducted on the widely used IU X-ray and MIMIC-CXR datasets. On IU X-ray, the method achieves METEOR and ROUGE-L scores of 0.198 and 0.392, respectively. On MIMIC-CXR, it achieves precision, recall, and F1 scores of 0.384, 0.376, and 0.304, respectively. The results indicate that the proposed method outperforms previous state-of-the-art models.

Conclusions:

This work improves the performance of automatic medical report generation, making its application in computer-aided diagnosis feasible.
Source journal
Computer methods and programs in biomedicine
Category: Engineering & Technology, Biomedical Engineering
CiteScore: 12.30
Self-citation rate: 6.60%
Articles published: 601
Review time: 135 days
Aims and scope: To encourage the development of formal computing methods, and their application in biomedical research and medical practice, by illustration of fundamental principles in biomedical informatics research; to stimulate basic research into application software design; to report the state of research of biomedical information processing projects; to report new computer methodologies applied in biomedical areas; the eventual distribution of demonstrable software to avoid duplication of effort; to provide a forum for discussion and improvement of existing software; to optimize contact between national organizations and regional user groups by promoting an international exchange of information on formal methods, standards and software in biomedicine.

Computer Methods and Programs in Biomedicine covers computing methodology and software systems derived from computing science for implementation in all aspects of biomedical research and medical practice. It is designed to serve: biochemists; biologists; geneticists; immunologists; neuroscientists; pharmacologists; toxicologists; clinicians; epidemiologists; psychiatrists; psychologists; cardiologists; chemists; (radio)physicists; computer scientists; programmers and systems analysts; biomedical, clinical, electrical and other engineers; teachers of medical informatics and users of educational software.