ModFus-PD: synergizing cross-modal attention and contrastive learning for enhanced multimodal diagnosis of Parkinson's disease.

IF 2.3 4区医学 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Frontiers in Computational Neuroscience Pub Date : 2025-07-16 eCollection Date: 2025-01-01 DOI:10.3389/fncom.2025.1604399

Xiangze Teng, Xiang Li, Benzheng Wei

{"title":"ModFus-PD: synergizing cross-modal attention and contrastive learning for enhanced multimodal diagnosis of Parkinson's disease.","authors":"Xiangze Teng, Xiang Li, Benzheng Wei","doi":"10.3389/fncom.2025.1604399","DOIUrl":null,"url":null,"abstract":"<p><p>Parkinson's disease (PD) is a complex neurodegenerative disorder characterized by a high rate of misdiagnosis, underscoring the critical importance of early and accurate diagnosis. Although existing computer-aided diagnostic systems integrate clinical assessment scales with neuroimaging data, they typically rely on superficial feature concatenation, which fails to capture the deep inter-modal dependencies essential for effective multimodal fusion. To address this limitation, we propose ModFus-PD, Contrastive learning effectively aligns heterogeneous modalities such as imaging and clinical text, while the cross-modal attention mechanism further exploits semantic interactions between them to enhance feature fusion. The framework comprises three key components: (1) a contrastive learning-based feature alignment module that projects MRI data and clinical text prompts into a unified embedding space via pretrained image and text encoders; (2) a bidirectional cross-modal attention module in which textual semantics guide MRI feature refinement for improved sensitivity to PD-related brain regions, while MRI features simultaneously enhance the contextual understanding of clinical text; (3) a hierarchical classification module that integrates the fused representations through two fully connected layers to produce final PD classification probabilities. Experiments on the PPMI dataset demonstrate the superior performance of ModFus-PD, achieving an accuracy of 0.903, AUC of 0.892, and F1 score of 0.840, surpassing several state-of-the-art baselines. These results validate the effectiveness of our cross-modal fusion strategy, which enables interpretable and reliable diagnostic support, holding promise for future clinical translation.</p>","PeriodicalId":12363,"journal":{"name":"Frontiers in Computational Neuroscience","volume":"19 ","pages":"1604399"},"PeriodicalIF":2.3000,"publicationDate":"2025-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12307351/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Frontiers in Computational Neuroscience","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.3389/fncom.2025.1604399","RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2025/1/1 0:00:00","PubModel":"eCollection","JCR":"Q2","JCRName":"MATHEMATICAL & COMPUTATIONAL BIOLOGY","Score":null,"Total":0}

引用次数: 0

Abstract

Parkinson's disease (PD) is a complex neurodegenerative disorder characterized by a high rate of misdiagnosis, underscoring the critical importance of early and accurate diagnosis. Although existing computer-aided diagnostic systems integrate clinical assessment scales with neuroimaging data, they typically rely on superficial feature concatenation, which fails to capture the deep inter-modal dependencies essential for effective multimodal fusion. To address this limitation, we propose ModFus-PD, Contrastive learning effectively aligns heterogeneous modalities such as imaging and clinical text, while the cross-modal attention mechanism further exploits semantic interactions between them to enhance feature fusion. The framework comprises three key components: (1) a contrastive learning-based feature alignment module that projects MRI data and clinical text prompts into a unified embedding space via pretrained image and text encoders; (2) a bidirectional cross-modal attention module in which textual semantics guide MRI feature refinement for improved sensitivity to PD-related brain regions, while MRI features simultaneously enhance the contextual understanding of clinical text; (3) a hierarchical classification module that integrates the fused representations through two fully connected layers to produce final PD classification probabilities. Experiments on the PPMI dataset demonstrate the superior performance of ModFus-PD, achieving an accuracy of 0.903, AUC of 0.892, and F1 score of 0.840, surpassing several state-of-the-art baselines. These results validate the effectiveness of our cross-modal fusion strategy, which enables interpretable and reliable diagnostic support, holding promise for future clinical translation.

查看原文本刊更多论文

ModFus-PD：协同跨模态注意和对比学习以增强帕金森病的多模态诊断

帕金森病（PD）是一种复杂的神经退行性疾病，其特点是误诊率高，强调了早期准确诊断的重要性。尽管现有的计算机辅助诊断系统将临床评估量表与神经影像学数据相结合，但它们通常依赖于表面特征拼接，无法捕获有效多模态融合所必需的深层多模态依赖关系。为了解决这一限制，我们提出ModFus-PD，对比学习有效地对齐异质模式，如成像和临床文本，而跨模态注意机制进一步利用它们之间的语义交互来增强特征融合。该框架包括三个关键部分：(1)基于对比学习的特征对齐模块，通过预训练的图像和文本编码器将MRI数据和临床文本提示投影到统一的嵌入空间；(2)双向跨模态注意模块，文本语义引导MRI特征细化，提高pd相关脑区敏感性，而MRI特征同时增强临床文本的语境理解；(3)分层分类模块，通过两个完全连通的层将融合后的表示进行整合，生成最终的PD分类概率。在PPMI数据集上的实验证明了ModFus-PD的优越性能，其精度为0.903，AUC为0.892，F1得分为0.840，超过了几种最先进的基线。这些结果验证了我们的跨模式融合策略的有效性，该策略提供了可解释和可靠的诊断支持，为未来的临床翻译带来了希望。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

Frontiers in Computational Neuroscience MATHEMATICAL & COMPUTATIONAL BIOLOGY-NEUROSCIENCES

CiteScore

5.30

自引率

3.10%

发文量

166

审稿时长

6-12 weeks

期刊介绍： Frontiers in Computational Neuroscience is a first-tier electronic journal devoted to promoting theoretical modeling of brain function and fostering interdisciplinary interactions between theoretical and experimental neuroscience. Progress in understanding the amazing capabilities of the brain is still limited, and we believe that it will only come with deep theoretical thinking and mutually stimulating cooperation between different disciplines and approaches. We therefore invite original contributions on a wide range of topics that present the fruits of such cooperation, or provide stimuli for future alliances. We aim to provide an interactive forum for cutting-edge theoretical studies of the nervous system, and for promulgating the best theoretical research to the broader neuroscience community. Models of all styles and at all levels are welcome, from biophysically motivated realistic simulations of neurons and synapses to high-level abstract models of inference and decision making. While the journal is primarily focused on theoretically based and driven research, we welcome experimental studies that validate and test theoretical conclusions. Also: comp neuro