End-to-End Clustering Enhanced Contrastive Learning for Radiology Reports Generation

IF 5.3; Zone 3, Computer Science; Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

IEEE Transactions on Emerging Topics in Computational Intelligence Pub Date : 2024-09-02 DOI:10.1109/TETCI.2024.3449876

Xinyao Liu;Junchang Xin;Qi Shen;Chuangang Li;Zhihong Huang;Zhiqiong Wang

Volume 9, Issue 2, pp. 1780-1794. Publication type: Journal Article. https://ieeexplore.ieee.org/document/10663478/


Abstract

With the rapid growth of medical imaging data, radiologists must dedicate a significant amount of time to report writing. Automated generation of radiology reports not only alleviates the heavy workload of physicians but, more importantly, can reduce mistakes and oversights caused by insufficient experience. However, due to substantial data bias in medical datasets, prior studies that train encoder-decoder architectures with a typical cross-entropy loss often produce generic descriptions of normal tissues and may overlook crucial clinical abnormalities. Therefore, we propose a clustering enhanced contrastive learning model named CECL to generate more diverse radiology reports; notably, the method is end-to-end trainable. Specifically, an adaptive alignment fusion encoder-decoder network (AAF) is constructed by fusing the image features with the text semantic features from the transformer decoder, eliminating information redundancy across modalities. Moreover, a label-guided contrastive learning (LCL) module is proposed: clustering is performed on the fused features using Gaussian competition, and supervised contrastive learning is then conducted on the clustering results to enhance feature representation ability. We evaluate CECL on two widely used public datasets, IU X-ray and MIMIC-CXR, using natural language generation (NLG) and clinical efficacy (CE) metrics. The experimental results demonstrate that CECL produces fluent reports with more descriptions of anomalies and outperforms other state-of-the-art methods with higher clinical correctness.
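
The contrastive step named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify how Gaussian competition clustering works, so the cluster assignments are simply taken as given input here, and the loss is the generic supervised contrastive form (samples in the same cluster are treated as positives). The function name `supervised_contrastive_loss`, the `temperature` value, and the toy feature vectors are all assumptions made for illustration.

```python
import numpy as np

def supervised_contrastive_loss(features, labels, temperature=0.1):
    """Generic supervised contrastive loss (illustrative, hypothetical helper).

    features: (n, d) array standing in for fused image-text features.
    labels:   (n,) array of cluster ids used as pseudo-labels.
    Assumes n >= 2 and at least one cluster with two or more members.
    """
    features = np.asarray(features, dtype=np.float64)
    labels = np.asarray(labels)
    # L2-normalise so dot products are cosine similarities.
    z = features / np.linalg.norm(features, axis=1, keepdims=True)
    logits = z @ z.T / temperature
    not_self = ~np.eye(len(labels), dtype=bool)
    # Numerically stable log-softmax over all other samples (self excluded).
    logits = logits - logits.max(axis=1, keepdims=True)
    exp_logits = np.exp(logits) * not_self
    log_prob = logits - np.log(exp_logits.sum(axis=1, keepdims=True))
    # Positives: other samples that share the anchor's cluster id.
    positives = (labels[:, None] == labels[None, :]) & not_self
    pos_count = positives.sum(axis=1)
    has_pos = pos_count > 0  # skip anchors that are alone in their cluster
    mean_log_prob_pos = (log_prob * positives).sum(axis=1)[has_pos] / pos_count[has_pos]
    return float(-mean_log_prob_pos.mean())

# Toy usage: four 2-D "fused features" forming two tight groups.
fused = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
clusters = np.array([0, 0, 1, 1])  # assumed output of the clustering step
print(supervised_contrastive_loss(fused, clusters))
```

The loss is small when samples assigned to the same cluster already lie close together and grows when cluster-mates are far apart, which is the pressure that pulls same-cluster features together during training.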


Source journal

IEEE Transactions on Emerging Topics in Computational Intelligence (Mathematics: Control and Optimization)

CiteScore

10.30

Self-citation rate

7.50%

Articles published

147

About the journal: The IEEE Transactions on Emerging Topics in Computational Intelligence (TETCI) publishes original articles on emerging aspects of computational intelligence, including theory, applications, and surveys. TETCI is an electronic-only publication and publishes six issues per year. Authors are encouraged to submit manuscripts on any emerging topic in computational intelligence, especially nature-inspired computing topics not covered by other IEEE Computational Intelligence Society journals. Illustrative examples include glial cell networks, computational neuroscience, brain-computer interfaces, ambient intelligence, non-fuzzy computing with words, artificial life, cultural learning, artificial endocrine networks, social reasoning, artificial hormone networks, and computational intelligence for the IoT and Smart-X technologies.