{"title":"Multimodal representation learning with hierarchical knowledge decomposition for cancer survival analysis","authors":"Xulin Yang , Hang Qiu","doi":"10.1016/j.neucom.2025.131053","DOIUrl":null,"url":null,"abstract":"<div><div>The accurate survival analysis of cancer patients serves as an important decision-making basis for formulating personalized treatment plans. Recent studies illustrate that integrating multimodal information including clinical diagnostic data, genomic features, and whole-slide images (WSIs) can significantly enhance the performance of prognostic assessment models. However, most existing multimodal survival analysis methods excessively focus on extracting shared inter-modal features while failing to effectively mine modality-specific biological information, resulting in inadequate acquisition of comprehensive patient multimodal representations. Furthermore, how to extract prognosis-related tissue microenvironment features from ultra-high-resolution WSIs remains an unresolved open challenge. To address these issues, a multimodal representation learning framework with hierarchical knowledge decomposition (MRL-HKD) is proposed for cancer survival analysis. MRL-HKD transforms multimodal representation learning into a set partitioning problem within multimodal knowledge spaces, and employs a hierarchical multimodal knowledge decomposition module to decouple complex inter-modal relationships. Meanwhile, to address the challenge of high-dimensional pathological image feature extraction, a gated attention mechanism-based multimodal patch attention network is designed. The performance comparison experiments on four cancer datasets demonstrate that MRL-HKD significantly outperforms state-of-the-art methods. Our study demonstrates the potential of gated attention mechanisms and hierarchical knowledge decomposition in multimodal survival analysis, and provides an effective tool for cancer prognosis prediction. The source code will be open-sourced at <span><span>https://github.com/yangxulin/MRL-HKD</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"652 ","pages":"Article 131053"},"PeriodicalIF":5.5000,"publicationDate":"2025-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Neurocomputing","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0925231225017254","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0
Abstract
Accurate survival analysis of cancer patients provides an important basis for formulating personalized treatment plans. Recent studies show that integrating multimodal information, including clinical diagnostic data, genomic features, and whole-slide images (WSIs), can significantly improve the performance of prognostic assessment models. However, most existing multimodal survival analysis methods focus excessively on extracting shared inter-modal features while failing to effectively mine modality-specific biological information, and thus fail to obtain comprehensive multimodal patient representations. Furthermore, extracting prognosis-related tissue microenvironment features from ultra-high-resolution WSIs remains an open challenge. To address these issues, a multimodal representation learning framework with hierarchical knowledge decomposition (MRL-HKD) is proposed for cancer survival analysis. MRL-HKD casts multimodal representation learning as a set partitioning problem over multimodal knowledge spaces and employs a hierarchical multimodal knowledge decomposition module to decouple complex inter-modal relationships. Meanwhile, to address the challenge of extracting features from high-dimensional pathological images, a multimodal patch attention network based on a gated attention mechanism is designed. Comparison experiments on four cancer datasets demonstrate that MRL-HKD significantly outperforms state-of-the-art methods. Our study highlights the potential of gated attention mechanisms and hierarchical knowledge decomposition for multimodal survival analysis and provides an effective tool for cancer prognosis prediction. The source code will be open-sourced at https://github.com/yangxulin/MRL-HKD.
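The abstract names two technical components without giving implementation details: a hierarchical decomposition of multimodal knowledge into shared and modality-specific parts, and a gated attention network for aggregating WSI patch features. The two sketches below are illustrative assumptions only, not the authors' code (which is to be released at the GitHub link above); all names (SharedSpecificDecomposer, orthogonality_loss, GatedAttentionPool, d_in, d_model, d_attn) are hypothetical. The first sketch shows one common way to decompose a modality embedding into a shared and a specific component, with an auxiliary loss that discourages the two from encoding the same information.

```python
# Minimal, illustrative sketch (not the MRL-HKD implementation) of splitting one
# modality's embedding into shared and modality-specific components.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SharedSpecificDecomposer(nn.Module):
    """Projects one modality's embedding into a shared subspace (information
    expected to be common across modalities) and a modality-specific subspace."""
    def __init__(self, d_in: int, d_model: int):
        super().__init__()
        self.to_shared = nn.Sequential(nn.Linear(d_in, d_model), nn.ReLU())
        self.to_specific = nn.Sequential(nn.Linear(d_in, d_model), nn.ReLU())

    def forward(self, x: torch.Tensor):
        return self.to_shared(x), self.to_specific(x)

def orthogonality_loss(shared: torch.Tensor, specific: torch.Tensor) -> torch.Tensor:
    """Auxiliary loss (squared cosine similarity) pushing the shared and
    specific components to carry different information."""
    s = F.normalize(shared, dim=-1)
    p = F.normalize(specific, dim=-1)
    return (s * p).sum(dim=-1).pow(2).mean()
```

For the WSI branch, the following is a minimal sketch of gated attention pooling over a bag of patch embeddings, in the spirit of Ilse et al. (2018); the paper's multimodal patch attention network presumably also conditions this pooling on the other modalities, which is omitted here.

```python
class GatedAttentionPool(nn.Module):
    """Gated attention pooling: a tanh branch and a sigmoid gate are multiplied
    elementwise to score each patch, and the softmax-normalized scores weight
    the patches into a single slide-level vector."""
    def __init__(self, d_in: int, d_attn: int = 128):
        super().__init__()
        self.V = nn.Linear(d_in, d_attn)  # tanh feature branch
        self.U = nn.Linear(d_in, d_attn)  # sigmoid gating branch
        self.w = nn.Linear(d_attn, 1)     # per-patch attention score

    def forward(self, patches: torch.Tensor):
        # patches: (num_patches, d_in) embeddings from one whole-slide image
        scores = self.w(torch.tanh(self.V(patches)) * torch.sigmoid(self.U(patches)))
        attn = torch.softmax(scores, dim=0)   # (num_patches, 1)
        slide = (attn * patches).sum(dim=0)   # (d_in,) slide-level vector
        return slide, attn

# Example: pool 500 hypothetical 1024-d patch embeddings into one slide vector.
pool = GatedAttentionPool(d_in=1024)
slide_vec, weights = pool(torch.randn(500, 1024))
```

The gate lets the network suppress uninformative patches before scoring, which is why this pooling style is widely used for ultra-high-resolution WSIs that must be processed as bags of patches.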
About the Journal
Neurocomputing publishes articles describing recent fundamental contributions to the field of neurocomputing. The essential topics covered are neurocomputing theory, practice, and applications.