Anatomy-guided slice-description interaction for multimodal brain disease diagnosis based on 3D image and radiological report

IF 4.9 · CAS Zone 2 (Medicine) · JCR Q1, ENGINEERING, BIOMEDICAL

Computerized Medical Imaging and Graphics Pub Date : 2025-04-25 DOI:10.1016/j.compmedimag.2025.102556

Xin Gao, Meihui Zhang, Junjie Li, Shanbo Zhao, Zhizheng Zhuo, Liying Qu, Jinyuan Weng, Li Chai, Yunyun Duan, Chuyang Ye, Yaou Liu

Volume 123, Article 102556. Full text: https://www.sciencedirect.com/science/article/pii/S0895611125000655

Citations: 0

Abstract

Accurate diagnosis of brain disease from radiological images is valuable in clinical practice because it can facilitate early intervention and reduce the risk of damage. However, existing unimodal image-based models struggle to process high-dimensional 3D brain imaging data effectively. With the development of vision-language models, multimodal diagnosis approaches that combine medical images with their corresponding radiological reports have made promising progress. Most of these methods, however, handle 2D images and cannot be applied directly to brain disease diagnosis, which relies on 3D images. In this work we therefore develop a multimodal brain disease diagnosis model that takes 3D brain images and their radiological reports as input. Motivated by the way radiologists scroll through image slices and write the important findings into the report, we propose a slice-description cross-modality interaction mechanism to realize fine-grained multimodal data interaction. Moreover, since previous medical research has demonstrated a potential correlation between the anatomical location of anomalies and diagnosis results, we further explore the use of brain anatomical prior knowledge to improve the multimodal interaction. Guided by the report description, the prior knowledge filters the image information by suppressing irrelevant regions and enhancing relevant slices. Our method was validated on two brain disease diagnosis tasks. The results indicate that our model outperforms competing unimodal and multimodal methods for brain disease diagnosis. In particular, it yielded average accuracy improvements of 15.87% and 7.39% over the image-based and multimodal competing methods, respectively.
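The abstract does not specify how the slice-description interaction or the anatomical filtering is implemented. As a rough illustration of the general idea only, the sketch below lets one report description attend over per-slice feature vectors using scaled dot-product attention, with a boolean relevance list standing in for the anatomical prior. The function names `softmax` and `attend`, the use of attention, and the hard masking are all assumptions made for illustration, not the authors' method.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(desc_vec, slice_vecs, relevant):
    """Fuse slice features for one description (hypothetical sketch).

    desc_vec   -- feature vector of one report description
    slice_vecs -- one feature vector per image slice, same length as desc_vec
    relevant   -- relevant[i] is True when the (assumed) anatomical prior
                  marks slice i as relevant to this description
    Returns (weights, fused): attention weights per slice and the
    weighted sum of slice features.
    """
    scale = math.sqrt(len(desc_vec))
    scores = []
    for vec, keep in zip(slice_vecs, relevant):
        dot = sum(a * b for a, b in zip(desc_vec, vec))
        # Slices the prior marks irrelevant get a very low score, so their
        # weight underflows to zero after the softmax.
        scores.append(dot / scale if keep else -1e9)
    weights = softmax(scores)
    dim = len(desc_vec)
    fused = [sum(w * vec[k] for w, vec in zip(weights, slice_vecs))
             for k in range(dim)]
    return weights, fused


# Toy example: four slices, the prior keeps only the first two.
slices = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [5.0, 5.0]]
weights, fused = attend([1.0, 0.0], slices, [True, True, False, False])
print(weights, fused)
```

In this toy setup the masked slices receive zero weight, so the fused vector depends only on the slices the prior keeps. If the prior marks every slice as irrelevant, the weights fall back to uniform; a real system would need an explicit policy for that case.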




Source journal

Computerized Medical Imaging and Graphics (Medicine: Nuclear Medicine)

CiteScore: 10.70

Self-citation rate: 3.50%

Articles published:

Review time: 26 days

About the journal: The purpose of the journal Computerized Medical Imaging and Graphics is to act as a source for the exchange of research results concerning algorithmic advances, development, and application of digital imaging in disease detection, diagnosis, intervention, prevention, precision medicine, and population health. Included in the journal will be articles on novel computerized imaging or visualization techniques, including artificial intelligence and machine learning, augmented reality for surgical planning and guidance, big biomedical data visualization, computer-aided diagnosis, computerized-robotic surgery, image-guided therapy, imaging scanning and reconstruction, mobile and tele-imaging, radiomics, and imaging integration and modeling with other information relevant to digital health. The types of biomedical imaging include: magnetic resonance, computed tomography, ultrasound, nuclear medicine, X-ray, microwave, optical and multi-photon microscopy, video and sensory imaging, and the convergence of biomedical images with other non-imaging datasets.