Dominic Flack, Aakash Tripathi, Asim Waqas, Ghulam Rasool, Dimah Dera
Robust Multimodal Fusion for Survival Prediction in Cancer Patients
Journal: Cancer Informatics, volume 24, article 11769351251376192
Published: 2025-09-27
DOI: https://doi.org/10.1177/11769351251376192
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12476512/pdf/
Citations: 0
Abstract
Objectives: Multimodal deep learning models have the potential to significantly improve survival predictions and treatment planning for cancer patients. These models integrate diverse data modalities using early, intermediate, or late fusion techniques. However, many existing multimodal models either underperform or show only marginal improvements over unimodal models. To establish the true efficacy of multimodal survival prediction models, it is critical to demonstrate consistent and substantial advantages over unimodal counterparts.
Methods: In this paper, we introduce the Robust Multimodal Survival Model (RMSurv), a novel discrete late fusion model that leverages synthetic data generation to compute time-dependent weights for various modalities. RMSurv uses up to 6 distinct data modalities from The Cancer Genome Atlas Program (TCGA) non-small-cell lung cancer and TCGA pan-cancer datasets to predict overall survival over a period of 10 years. The key innovations of RMSurv are the calculation of time-dependent late fusion weights using a synthetically generated dataset and a new statistical feature normalization technique to enhance the interpretability and accuracy of discrete survival predictions. We evaluate the proposed method and several alternatives with cross-validation using the concordance index, varying the number of modalities included. We also create a late fusion simulation to highlight the complex relationships of multimodal fusion.
Results: In our experiments, RMSurv exceeds the best unimodal model's concordance index (C-index) by 0.0273 on the 6-modal TCGA Lung Adenocarcinoma (LUAD) dataset. Existing late and early fusion methods improved the C-index by only 0.0143 and 0.0072, respectively. RMSurv also performs best on the combined TCGA non-small-cell lung cancer dataset and the TCGA pan-cancer dataset.
Conclusions: These advancements underscore RMSurv's potential as a powerful approach for survival prediction, establishing robust multimodal benefits, and setting a new benchmark for survival prediction models in pan-cancer settings.
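Illustrative sketch (not the authors' implementation): the abstract describes a discrete late fusion model with time-dependent modality weights, evaluated with the concordance index. The Python below shows one plausible reading of those two ideas. The function names `fuse_late` and `concordance_index`, the array shapes, and the hand-picked weights are all assumptions made for illustration; in RMSurv the weights are derived from a synthetically generated dataset, a step not reproduced here.

```python
import numpy as np


def fuse_late(survival, weights):
    """Weighted late fusion of discrete survival curves (hypothetical helper).

    survival: shape (M, N, T), survival probabilities predicted by each of
              M unimodal models for N patients over T discrete time bins.
    weights:  shape (M, T), non-negative weight per modality and time bin.
              Assumed given; the paper computes them from synthetic data.
    Returns an (N, T) array of fused survival probabilities.
    """
    survival = np.asarray(survival, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # Normalize so the modality weights sum to 1 within each time bin.
    norm = weights / weights.sum(axis=0, keepdims=True)
    # Broadcast (M, 1, T) against (M, N, T), then sum over modalities.
    return (norm[:, None, :] * survival).sum(axis=0)


def concordance_index(time, event, risk):
    """Harrell's C-index: share of comparable pairs ranked correctly.

    A pair (i, j) is comparable when patient i had an observed event and
    time[i] < time[j]. It is concordant when risk[i] > risk[j]; tied risk
    scores count as half.
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    risk = np.asarray(risk, dtype=float)
    concordant, comparable = 0.0, 0
    for i in range(len(time)):
        if not event[i]:
            continue  # censored patients cannot anchor a comparable pair
        for j in range(len(time)):
            if time[i] < time[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    return concordant / comparable


# Toy data: 2 modalities, 1 patient, 3 time bins.
toy_survival = [[[0.9, 0.8, 0.7]], [[0.5, 0.4, 0.3]]]
# Modality 0 is trusted more early on, modality 1 more at later times.
toy_weights = [[3, 1, 1], [1, 1, 3]]
fused = fuse_late(toy_survival, toy_weights)

# One simple risk score (an assumption): lower total survival = higher risk.
risk_from_fused = -fused.sum(axis=1)
```

A single scalar risk per patient is needed for the C-index, so the sketch collapses each fused curve by summing over time bins; other summaries (for example, survival at a fixed horizon) would work equally well in this toy setting.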
About the journal:
The field of cancer research relies on advances in many other disciplines, including omics technology, mass spectrometry, radio imaging, computer science, and biostatistics. Cancer Informatics provides open access to peer-reviewed high-quality manuscripts reporting bioinformatics analysis of molecular genetics and/or clinical data pertaining to cancer, emphasizing the use of machine learning, artificial intelligence, statistical algorithms, advanced imaging techniques, data visualization, and high-throughput technologies. As the leading journal dedicated exclusively to the report of the use of computational methods in cancer research and practice, Cancer Informatics leverages methodological improvements in systems biology, genomics, proteomics, metabolomics, and molecular biochemistry into the fields of cancer detection, treatment, classification, risk-prediction, prevention, outcome, and modeling.