{"title":"MM-GTUNets:用于脑疾病预测的统一多模态图深度学习","authors":"Luhui Cai;Weiming Zeng;Hongyu Chen;Hua Zhang;Yueyang Li;Yu Feng;Hongjie Yan;Lingbin Bian;Wai Ting Siok;Nizhuan Wang","doi":"10.1109/TMI.2025.3556420","DOIUrl":null,"url":null,"abstract":"Graph deep learning (GDL) has demonstrated impressive performance in predicting population-based brain disorders (BDs) through the integration of both imaging and non-imaging data. However, the effectiveness of GDL-based methods heavily depends on the quality of modeling multi-modal population graphs and tends to degrade as the graph scale increases. Moreover, these methods often limit interactions between imaging and non-imaging data to node-edge interactions within the graph, overlooking complex inter-modal correlations and resulting in suboptimal outcomes. To address these challenges, we propose MM-GTUNets, an end-to-end Graph Transformer-based multi-modal graph deep learning (MMGDL) framework designed for large-scale brain disorders prediction. To effectively utilize rich multi-modal disease-related information, we introduce <underline>M</u>odality <underline>R</u>eward <underline>R</u>epresentation <underline>L</u>earning (MRRL), which dynamically constructs population graphs using an Affinity Metric Reward System (AMRS). We also employ a variational autoencoder to reconstruct latent representations of non-imaging features aligned with imaging features. Based on this, we introduce <underline>A</u>daptive <underline>C</u>ross-<underline>M</u>odal <underline>G</u>raph <underline>L</u>earning (ACMGL), which captures critical modality-specific and modality-shared features through a unified GTUNet encoder, taking advantages of Graph UNet and Graph Transformer, along with a feature fusion module. We validated our method on two public multi-modal datasets ABIDE and ADHD-200, demonstrating its superior performance in diagnosing BDs. Our code is available at <uri>https://github.com/NZWANG/MM-GTUNets</uri><uri>https://github.com/NZWANG/MM-GTUNets</uri>","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"44 9","pages":"3705-3716"},"PeriodicalIF":0.0000,"publicationDate":"2025-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"MM-GTUNets: Unified Multi-Modal Graph Deep Learning for Brain Disorders Prediction\",\"authors\":\"Luhui Cai;Weiming Zeng;Hongyu Chen;Hua Zhang;Yueyang Li;Yu Feng;Hongjie Yan;Lingbin Bian;Wai Ting Siok;Nizhuan Wang\",\"doi\":\"10.1109/TMI.2025.3556420\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Graph deep learning (GDL) has demonstrated impressive performance in predicting population-based brain disorders (BDs) through the integration of both imaging and non-imaging data. However, the effectiveness of GDL-based methods heavily depends on the quality of modeling multi-modal population graphs and tends to degrade as the graph scale increases. Moreover, these methods often limit interactions between imaging and non-imaging data to node-edge interactions within the graph, overlooking complex inter-modal correlations and resulting in suboptimal outcomes. To address these challenges, we propose MM-GTUNets, an end-to-end Graph Transformer-based multi-modal graph deep learning (MMGDL) framework designed for large-scale brain disorders prediction. 
To effectively utilize rich multi-modal disease-related information, we introduce <underline>M</u>odality <underline>R</u>eward <underline>R</u>epresentation <underline>L</u>earning (MRRL), which dynamically constructs population graphs using an Affinity Metric Reward System (AMRS). We also employ a variational autoencoder to reconstruct latent representations of non-imaging features aligned with imaging features. Based on this, we introduce <underline>A</u>daptive <underline>C</u>ross-<underline>M</u>odal <underline>G</u>raph <underline>L</u>earning (ACMGL), which captures critical modality-specific and modality-shared features through a unified GTUNet encoder, taking advantages of Graph UNet and Graph Transformer, along with a feature fusion module. We validated our method on two public multi-modal datasets ABIDE and ADHD-200, demonstrating its superior performance in diagnosing BDs. Our code is available at <uri>https://github.com/NZWANG/MM-GTUNets</uri><uri>https://github.com/NZWANG/MM-GTUNets</uri>\",\"PeriodicalId\":94033,\"journal\":{\"name\":\"IEEE transactions on medical imaging\",\"volume\":\"44 9\",\"pages\":\"3705-3716\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2025-04-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE transactions on medical imaging\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10946209/\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on medical imaging","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10946209/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
MM-GTUNets: Unified Multi-Modal Graph Deep Learning for Brain Disorders Prediction
Graph deep learning (GDL) has demonstrated impressive performance in predicting population-based brain disorders (BDs) through the integration of both imaging and non-imaging data. However, the effectiveness of GDL-based methods heavily depends on the quality of the modeled multi-modal population graphs and tends to degrade as the graph scale increases. Moreover, these methods often limit interactions between imaging and non-imaging data to node-edge interactions within the graph, overlooking complex inter-modal correlations and yielding suboptimal outcomes. To address these challenges, we propose MM-GTUNets, an end-to-end Graph Transformer-based multi-modal graph deep learning (MMGDL) framework designed for large-scale brain disorder prediction. To effectively exploit rich multi-modal disease-related information, we introduce Modality Reward Representation Learning (MRRL), which dynamically constructs population graphs using an Affinity Metric Reward System (AMRS). We also employ a variational autoencoder to reconstruct latent representations of non-imaging features aligned with imaging features. Building on this, we introduce Adaptive Cross-Modal Graph Learning (ACMGL), which captures critical modality-specific and modality-shared features through a unified GTUNet encoder that combines the strengths of Graph UNet and Graph Transformer, together with a feature fusion module. We validated our method on two public multi-modal datasets, ABIDE and ADHD-200, demonstrating its superior performance in diagnosing BDs. Our code is available at https://github.com/NZWANG/MM-GTUNets.
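The abstract does not detail how the population graph is assembled; the authors' implementation is in the linked repository. As a rough, assumption-laden sketch of the general idea that MRRL/AMRS builds on (an affinity-weighted population graph fusing imaging and non-imaging data), the snippet below combines an imaging-feature similarity with a simple phenotypic affinity using a fixed mixing weight. All names (`build_population_graph`, `alpha`, `threshold`) and the fixed weighting are illustrative choices, not the paper's method: AMRS adjusts the modality weighting dynamically via a reward mechanism rather than a hand-set `alpha`.

```python
# Illustrative sketch only, not the MM-GTUNets implementation.
import torch
import torch.nn.functional as F


def cosine_affinity(x: torch.Tensor) -> torch.Tensor:
    """Pairwise cosine similarity between subjects' imaging feature vectors."""
    x = F.normalize(x, dim=1)
    return x @ x.T


def phenotypic_affinity(pheno: torch.Tensor, tol: float = 2.0) -> torch.Tensor:
    """Simple non-imaging affinity: fraction of phenotypic attributes
    (e.g. site, sex, age bin) whose values differ by at most `tol`."""
    diff = (pheno.unsqueeze(1) - pheno.unsqueeze(0)).abs()   # (N, N, P)
    return (diff <= tol).float().mean(dim=-1)                # (N, N)


def build_population_graph(img_feats, pheno, alpha=0.5, threshold=0.6):
    """Weighted adjacency fusing imaging and non-imaging affinities;
    `alpha` plays the role that AMRS learns adaptively in the paper."""
    a = alpha * cosine_affinity(img_feats) + (1 - alpha) * phenotypic_affinity(pheno)
    adj = torch.where(a >= threshold, a, torch.zeros_like(a))  # sparsify weak edges
    adj.fill_diagonal_(1.0)                                    # keep self-loops
    return adj


if __name__ == "__main__":
    n_subjects, img_dim, n_pheno = 8, 116, 3
    img_feats = torch.randn(n_subjects, img_dim)                 # e.g. connectivity features
    pheno = torch.randint(0, 5, (n_subjects, n_pheno)).float()   # e.g. coded phenotypes
    print(build_population_graph(img_feats, pheno).shape)        # torch.Size([8, 8])
```

The resulting adjacency would then feed a graph encoder (in the paper, the GTUNet encoder within ACMGL); the thresholding step is a common way to keep large population graphs sparse as the number of subjects grows.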