{"title":"MM-GTUNets: Unified Multi-Modal Graph Deep Learning for Brain Disorders Prediction","authors":"Luhui Cai;Weiming Zeng;Hongyu Chen;Hua Zhang;Yueyang Li;Yu Feng;Hongjie Yan;Lingbin Bian;Wai Ting Siok;Nizhuan Wang","doi":"10.1109/TMI.2025.3556420","DOIUrl":null,"url":null,"abstract":"Graph deep learning (GDL) has demonstrated impressive performance in predicting population-based brain disorders (BDs) through the integration of both imaging and non-imaging data. However, the effectiveness of GDL-based methods heavily depends on the quality of modeling multi-modal population graphs and tends to degrade as the graph scale increases. Moreover, these methods often limit interactions between imaging and non-imaging data to node-edge interactions within the graph, overlooking complex inter-modal correlations and resulting in suboptimal outcomes. To address these challenges, we propose MM-GTUNets, an end-to-end Graph Transformer-based multi-modal graph deep learning (MMGDL) framework designed for large-scale brain disorders prediction. To effectively utilize rich multi-modal disease-related information, we introduce <underline>M</u>odality <underline>R</u>eward <underline>R</u>epresentation <underline>L</u>earning (MRRL), which dynamically constructs population graphs using an Affinity Metric Reward System (AMRS). We also employ a variational autoencoder to reconstruct latent representations of non-imaging features aligned with imaging features. Based on this, we introduce <underline>A</u>daptive <underline>C</u>ross-<underline>M</u>odal <underline>G</u>raph <underline>L</u>earning (ACMGL), which captures critical modality-specific and modality-shared features through a unified GTUNet encoder, taking advantages of Graph UNet and Graph Transformer, along with a feature fusion module. We validated our method on two public multi-modal datasets ABIDE and ADHD-200, demonstrating its superior performance in diagnosing BDs. Our code is available at <uri>https://github.com/NZWANG/MM-GTUNets</uri><uri>https://github.com/NZWANG/MM-GTUNets</uri>","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"44 9","pages":"3705-3716"},"PeriodicalIF":0.0000,"publicationDate":"2025-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on medical imaging","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10946209/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
Graph deep learning (GDL) has demonstrated impressive performance in predicting population-based brain disorders (BDs) through the integration of both imaging and non-imaging data. However, the effectiveness of GDL-based methods heavily depends on the quality of the modeled multi-modal population graphs and tends to degrade as the graph scale increases. Moreover, these methods often limit interactions between imaging and non-imaging data to node-edge interactions within the graph, overlooking complex inter-modal correlations and yielding suboptimal outcomes. To address these challenges, we propose MM-GTUNets, an end-to-end Graph Transformer-based multi-modal graph deep learning (MMGDL) framework designed for large-scale brain disorder prediction. To effectively exploit rich multi-modal disease-related information, we introduce Modality Reward Representation Learning (MRRL), which dynamically constructs population graphs using an Affinity Metric Reward System (AMRS). We also employ a variational autoencoder to reconstruct latent representations of non-imaging features aligned with imaging features. Building on this, we introduce Adaptive Cross-Modal Graph Learning (ACMGL), which captures critical modality-specific and modality-shared features through a unified GTUNet encoder that takes advantage of both Graph UNet and Graph Transformer, along with a feature fusion module. We validated our method on two public multi-modal datasets, ABIDE and ADHD-200, demonstrating its superior performance in diagnosing BDs. Our code is available at https://github.com/NZWANG/MM-GTUNets
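To make the GTUNet encoder idea concrete, below is a minimal, illustrative sketch of one encoder block that pairs the two components the abstract names: Graph U-Net-style top-k node pooling (gPool) for hierarchical coarsening and Transformer self-attention over graph nodes for global context. All class and variable names (GTUNetBlock, pool_ratio, etc.) are hypothetical; the authors' actual implementation at https://github.com/NZWANG/MM-GTUNets may differ substantially.

```python
# Hypothetical sketch of a GTUNet-style block: GCN message passing,
# Transformer self-attention across nodes, then Graph U-Net gPool.
# Not the paper's implementation; shapes and names are assumptions.
import torch
import torch.nn as nn


class GTUNetBlock(nn.Module):
    def __init__(self, dim: int, heads: int = 4, pool_ratio: float = 0.5):
        super().__init__()
        self.gcn_weight = nn.Linear(dim, dim)          # GCN-style feature transform
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.score = nn.Linear(dim, 1)                 # gPool node-scoring projection
        self.pool_ratio = pool_ratio

    def forward(self, x: torch.Tensor, adj: torch.Tensor):
        # x: (N, dim) node features; adj: (N, N) dense population-graph adjacency.
        # 1) Local message passing with a row-normalized adjacency.
        deg = adj.sum(dim=-1, keepdim=True).clamp(min=1.0)
        x = torch.relu((adj / deg) @ self.gcn_weight(x))
        # 2) Global context: self-attention over all nodes (subjects).
        h, _ = self.attn(x.unsqueeze(0), x.unsqueeze(0), x.unsqueeze(0))
        x = x + h.squeeze(0)
        # 3) gPool: keep the top-k highest-scoring nodes, gated by their scores.
        k = max(1, int(self.pool_ratio * x.size(0)))
        scores = torch.sigmoid(self.score(x).squeeze(-1))
        idx = scores.topk(k).indices
        x = x[idx] * scores[idx].unsqueeze(-1)         # gate kept node features
        adj = adj[idx][:, idx]                         # induced subgraph adjacency
        return x, adj, idx


# Toy usage: 32 subjects with 64-dim fused features on a random population graph.
x = torch.randn(32, 64)
adj = (torch.rand(32, 32) > 0.7).float()
block = GTUNetBlock(dim=64)
x_out, adj_out, kept = block(x, adj)
print(x_out.shape, adj_out.shape)  # torch.Size([16, 64]) torch.Size([16, 16])
```

In a full U-Net-shaped encoder, several such blocks would be stacked with decreasing node counts, and the saved indices (kept) would support unpooling on the decoder side; how MM-GTUNets combines this with its feature fusion module is specified in the paper, not in this sketch.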