{"title":"[CRAKUT:整合对比区域注意力和临床先验知识在u型变压器放射学报告生成]。","authors":"Yedong Liang, Xiongfeng Zhu, Meiyan Huang, Wencong Zhang, Hanyu Guo, Qianjin Feng","doi":"10.12122/j.issn.1673-4254.2025.06.24","DOIUrl":null,"url":null,"abstract":"<p><strong>Objectives: </strong>We propose a Contrastive Regional Attention and Prior Knowledge-Infused U-Transformer model (CRAKUT) to address the challenges of imbalanced text distribution, lack of contextual clinical knowledge, and cross-modal information transformation to enhance the quality of generated radiology reports.</p><p><strong>Methods: </strong>The CRAKUT model comprises 3 key components, including an image encoder that utilizes common normal images from the dataset for extracting enhanced visual features, an external knowledge infuser that incorporates clinical prior knowledge, and a U-Transformer that facilitates cross-modal information conversion from vision to language. The contrastive regional attention in the image encoder was introduced to enhance the features of abnormal regions by emphasizing the difference between normal and abnormal semantic features. Additionally, the clinical prior knowledge infuser within the text encoder integrates clinical history and knowledge graphs generated by ChatGPT. Finally, the U-Transformer was utilized to connect the multi-modal encoder and the report decoder in a U-connection schema, and multiple types of information were used to fuse and obtain the final report.</p><p><strong>Results: </strong>We evaluated the proposed CRAKUT model on two publicly available CXR datasets (IU-Xray and MIMIC-CXR). The experimental results showed that the CRAKUT model achieved a state-of-the-art performance on report generation with a BLEU-4 score of 0.159, a ROUGE-L score of 0.353, and a CIDEr score of 0.500 in MIMIC-CXR dataset; the model also had a METEOR score of 0.258 in IU-Xray dataset, outperforming all the comparison models.</p><p><strong>Conclusions: </strong>The proposed method has great potential for application in clinical disease diagnoses and report generation.</p>","PeriodicalId":18962,"journal":{"name":"南方医科大学学报杂志","volume":"45 6","pages":"1343-1352"},"PeriodicalIF":0.0000,"publicationDate":"2025-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12204827/pdf/","citationCount":"0","resultStr":"{\"title\":\"[CRAKUT:integrating contrastive regional attention and clinical prior knowledge in U-transformer for radiology report generation].\",\"authors\":\"Yedong Liang, Xiongfeng Zhu, Meiyan Huang, Wencong Zhang, Hanyu Guo, Qianjin Feng\",\"doi\":\"10.12122/j.issn.1673-4254.2025.06.24\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><strong>Objectives: </strong>We propose a Contrastive Regional Attention and Prior Knowledge-Infused U-Transformer model (CRAKUT) to address the challenges of imbalanced text distribution, lack of contextual clinical knowledge, and cross-modal information transformation to enhance the quality of generated radiology reports.</p><p><strong>Methods: </strong>The CRAKUT model comprises 3 key components, including an image encoder that utilizes common normal images from the dataset for extracting enhanced visual features, an external knowledge infuser that incorporates clinical prior knowledge, and a U-Transformer that facilitates cross-modal information conversion from vision to language. 
The contrastive regional attention in the image encoder was introduced to enhance the features of abnormal regions by emphasizing the difference between normal and abnormal semantic features. Additionally, the clinical prior knowledge infuser within the text encoder integrates clinical history and knowledge graphs generated by ChatGPT. Finally, the U-Transformer was utilized to connect the multi-modal encoder and the report decoder in a U-connection schema, and multiple types of information were used to fuse and obtain the final report.</p><p><strong>Results: </strong>We evaluated the proposed CRAKUT model on two publicly available CXR datasets (IU-Xray and MIMIC-CXR). The experimental results showed that the CRAKUT model achieved a state-of-the-art performance on report generation with a BLEU-4 score of 0.159, a ROUGE-L score of 0.353, and a CIDEr score of 0.500 in MIMIC-CXR dataset; the model also had a METEOR score of 0.258 in IU-Xray dataset, outperforming all the comparison models.</p><p><strong>Conclusions: </strong>The proposed method has great potential for application in clinical disease diagnoses and report generation.</p>\",\"PeriodicalId\":18962,\"journal\":{\"name\":\"南方医科大学学报杂志\",\"volume\":\"45 6\",\"pages\":\"1343-1352\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2025-06-20\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12204827/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"南方医科大学学报杂志\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.12122/j.issn.1673-4254.2025.06.24\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"Medicine\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"南方医科大学学报杂志","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.12122/j.issn.1673-4254.2025.06.24","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"Medicine","Score":null,"Total":0}
[CRAKUT: integrating contrastive regional attention and clinical prior knowledge in U-transformer for radiology report generation].
Objectives: We propose CRAKUT, a Contrastive Regional Attention and Prior Knowledge-Infused U-Transformer model, to improve the quality of generated radiology reports by addressing three challenges: imbalanced text distribution, lack of contextual clinical knowledge, and cross-modal information transformation.
Methods: The CRAKUT model comprises three key components: an image encoder that uses common normal images from the dataset to extract enhanced visual features, an external knowledge infuser that incorporates clinical prior knowledge, and a U-Transformer that performs cross-modal conversion from vision to language. The contrastive regional attention in the image encoder enhances the features of abnormal regions by emphasizing the difference between normal and abnormal semantic features. The clinical prior knowledge infuser within the text encoder integrates clinical history with knowledge graphs generated by ChatGPT. Finally, the U-Transformer connects the multi-modal encoder and the report decoder in a U-connection schema, fusing multiple types of information to produce the final report, as sketched below.
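The abstract does not give the exact formulation of the contrastive regional attention, so the following is only a minimal PyTorch sketch of the stated idea: abnormal-region features are emphasized by contrasting the input image's patch features against a pooled representation of normal reference images. The module name, dimensions, and the subtraction-based contrast signal are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn


class ContrastiveRegionalAttention(nn.Module):
    """Sketch: emphasize patches that deviate from a 'normal' prototype."""

    def __init__(self, dim: int, num_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, image_feats: torch.Tensor, normal_feats: torch.Tensor) -> torch.Tensor:
        # image_feats:  (B, N, D) patch features of the input image
        # normal_feats: (B, M, D) patch features from reference normal images
        # Contrast (assumption): subtract the pooled normal prototype so that
        # regions deviating from "normal" carry larger activations.
        normal_proto = normal_feats.mean(dim=1, keepdim=True)  # (B, 1, D)
        contrast = image_feats - normal_proto                  # (B, N, D)
        # Use the contrast signal as the query to re-weight the input patches,
        # attending more strongly to abnormal regions.
        enhanced, _ = self.attn(query=contrast, key=image_feats, value=image_feats)
        return self.norm(image_feats + enhanced)               # residual + norm


# Usage sketch: 196 patches (14x14 grid) with 512-d features; shapes are assumed.
cra = ContrastiveRegionalAttention(dim=512)
img = torch.randn(2, 196, 512)
normal = torch.randn(2, 196, 512)
out = cra(img, normal)  # (2, 196, 512), abnormal regions emphasized
```

The design choice sketched here (query from the contrast signal, keys and values from the raw image features) is one plausible way to realize "emphasizing the difference between normal and abnormal semantic features"; the actual CRAKUT module may differ.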
Results: We evaluated the proposed CRAKUT model on two publicly available chest X-ray (CXR) datasets, IU-Xray and MIMIC-CXR. The experimental results showed that CRAKUT achieved state-of-the-art report-generation performance, with a BLEU-4 score of 0.159, a ROUGE-L score of 0.353, and a CIDEr score of 0.500 on the MIMIC-CXR dataset, and a METEOR score of 0.258 on the IU-Xray dataset, outperforming all comparison models.
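The reported scores are standard n-gram captioning metrics. As a minimal sketch, BLEU-4 for a single generated report against a reference can be computed with NLTK as shown below; ROUGE-L, CIDEr, and METEOR are typically computed with toolkits such as pycocoevalcap. The abstract does not state which evaluation toolkit was used, and the example sentences here are invented for illustration.

```python
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

# Hypothetical reference report and model output (whitespace-tokenized).
reference = "the lungs are clear without focal consolidation".split()
candidate = "lungs are clear with no focal consolidation".split()

# Smoothing avoids zero scores when a higher-order n-gram has no match,
# which is common for short clinical sentences.
smooth = SmoothingFunction().method1
bleu4 = sentence_bleu(
    [reference],
    candidate,
    weights=(0.25, 0.25, 0.25, 0.25),  # equal 1- to 4-gram weights = BLEU-4
    smoothing_function=smooth,
)
print(f"BLEU-4: {bleu4:.3f}")
```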
Conclusions: The proposed method shows great potential for application in clinical disease diagnosis and report generation.