{"title":"Multimodal representation learning with hierarchical knowledge decomposition for cancer survival analysis","authors":"Xulin Yang , Hang Qiu","doi":"10.1016/j.neucom.2025.131053","DOIUrl":null,"url":null,"abstract":"<div><div>The accurate survival analysis of cancer patients serves as an important decision-making basis for formulating personalized treatment plans. Recent studies illustrate that integrating multimodal information including clinical diagnostic data, genomic features, and whole-slide images (WSIs) can significantly enhance the performance of prognostic assessment models. However, most existing multimodal survival analysis methods excessively focus on extracting shared inter-modal features while failing to effectively mine modality-specific biological information, resulting in inadequate acquisition of comprehensive patient multimodal representations. Furthermore, how to extract prognosis-related tissue microenvironment features from ultra-high-resolution WSIs remains an unresolved open challenge. To address these issues, a multimodal representation learning framework with hierarchical knowledge decomposition (MRL-HKD) is proposed for cancer survival analysis. MRL-HKD transforms multimodal representation learning into a set partitioning problem within multimodal knowledge spaces, and employs a hierarchical multimodal knowledge decomposition module to decouple complex inter-modal relationships. Meanwhile, to address the challenge of high-dimensional pathological image feature extraction, a gated attention mechanism-based multimodal patch attention network is designed. The performance comparison experiments on four cancer datasets demonstrate that MRL-HKD significantly outperforms state-of-the-art methods. Our study demonstrates the potential of gated attention mechanisms and hierarchical knowledge decomposition in multimodal survival analysis, and provides an effective tool for cancer prognosis prediction. The source code will be open-sourced at <span><span>https://github.com/yangxulin/MRL-HKD</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"652 ","pages":"Article 131053"},"PeriodicalIF":5.5000,"publicationDate":"2025-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Neurocomputing","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0925231225017254","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0
Abstract
Accurate survival analysis of cancer patients provides an important basis for formulating personalized treatment plans. Recent studies show that integrating multimodal information, including clinical diagnostic data, genomic features, and whole-slide images (WSIs), can significantly improve the performance of prognostic assessment models. However, most existing multimodal survival analysis methods focus excessively on extracting shared inter-modal features while failing to effectively mine modality-specific biological information, and thus fail to obtain comprehensive multimodal patient representations. Furthermore, extracting prognosis-related tissue microenvironment features from ultra-high-resolution WSIs remains an open challenge. To address these issues, a multimodal representation learning framework with hierarchical knowledge decomposition (MRL-HKD) is proposed for cancer survival analysis. MRL-HKD casts multimodal representation learning as a set partitioning problem over multimodal knowledge spaces and employs a hierarchical multimodal knowledge decomposition module to decouple complex inter-modal relationships. Meanwhile, to address the challenge of extracting features from high-dimensional pathological images, a multimodal patch attention network based on a gated attention mechanism is designed. Comparison experiments on four cancer datasets demonstrate that MRL-HKD significantly outperforms state-of-the-art methods. Our study highlights the potential of gated attention mechanisms and hierarchical knowledge decomposition for multimodal survival analysis and provides an effective tool for cancer prognosis prediction. The source code will be open-sourced at https://github.com/yangxulin/MRL-HKD.
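The abstract names two technical components without giving implementation details: a hierarchical decomposition of multimodal knowledge into shared and modality-specific parts, and a gated attention network for aggregating WSI patch features. The two sketches below are illustrative assumptions only, not the authors' code (which is to be released at the GitHub link above); all names (SharedSpecificDecomposer, orthogonality_loss, GatedAttentionPool, d_in, d_model, d_attn) are hypothetical. The first sketch shows one common way to decompose a modality embedding into a shared and a specific component, with an auxiliary loss that discourages the two from encoding the same information.

```python
# Minimal, illustrative sketch (not the MRL-HKD implementation) of splitting one
# modality's embedding into shared and modality-specific components.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SharedSpecificDecomposer(nn.Module):
    """Projects one modality's embedding into a shared subspace (information
    expected to be common across modalities) and a modality-specific subspace."""
    def __init__(self, d_in: int, d_model: int):
        super().__init__()
        self.to_shared = nn.Sequential(nn.Linear(d_in, d_model), nn.ReLU())
        self.to_specific = nn.Sequential(nn.Linear(d_in, d_model), nn.ReLU())

    def forward(self, x: torch.Tensor):
        return self.to_shared(x), self.to_specific(x)

def orthogonality_loss(shared: torch.Tensor, specific: torch.Tensor) -> torch.Tensor:
    """Auxiliary loss (squared cosine similarity) pushing the shared and
    specific components to carry different information."""
    s = F.normalize(shared, dim=-1)
    p = F.normalize(specific, dim=-1)
    return (s * p).sum(dim=-1).pow(2).mean()
```

For the WSI branch, the following is a minimal sketch of gated attention pooling over a bag of patch embeddings, in the spirit of Ilse et al. (2018); the paper's multimodal patch attention network presumably also conditions this pooling on the other modalities, which is omitted here.

```python
class GatedAttentionPool(nn.Module):
    """Gated attention pooling: a tanh branch and a sigmoid gate are multiplied
    elementwise to score each patch, and the softmax-normalized scores weight
    the patches into a single slide-level vector."""
    def __init__(self, d_in: int, d_attn: int = 128):
        super().__init__()
        self.V = nn.Linear(d_in, d_attn)  # tanh feature branch
        self.U = nn.Linear(d_in, d_attn)  # sigmoid gating branch
        self.w = nn.Linear(d_attn, 1)     # per-patch attention score

    def forward(self, patches: torch.Tensor):
        # patches: (num_patches, d_in) embeddings from one whole-slide image
        scores = self.w(torch.tanh(self.V(patches)) * torch.sigmoid(self.U(patches)))
        attn = torch.softmax(scores, dim=0)   # (num_patches, 1)
        slide = (attn * patches).sum(dim=0)   # (d_in,) slide-level vector
        return slide, attn

# Example: pool 500 hypothetical 1024-d patch embeddings into one slide vector.
pool = GatedAttentionPool(d_in=1024)
slide_vec, weights = pool(torch.randn(500, 1024))
```

The gate lets the network suppress uninformative patches before scoring, which is why this pooling style is widely used for ultra-high-resolution WSIs that must be processed as bags of patches.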
About the Journal
Neurocomputing publishes articles describing recent fundamental contributions to the field of neurocomputing. The essential topics covered are neurocomputing theory, practice, and applications.