Path-GPTOmic: A Balanced Multi-modal Learning Framework for Survival Outcome Prediction

Hongxiao Wang, Yang Yang, Zhuo Zhao, Pengfei Gu, Nishchal Sapkota, Danny Z. Chen
Source: arXiv - QuanBio - Genomics (identifier: arxiv-2403.11375). Published: 2024-03-18.
Citation count: 0

Abstract

For predicting cancer survival outcomes, standard approaches in clinical research are often based on two main modalities: pathology images, for observing cell morphology features, and genomic data (e.g., bulk RNA-seq), for quantifying gene expression. However, existing pathology-genomic multi-modal algorithms face significant challenges: (1) valuable biological insights regarding genes and gene-gene interactions are frequently overlooked; (2) one modality often dominates the optimization process, causing inadequate training of the other modality. In this paper, we introduce a new multi-modal framework, "Path-GPTOmic", for cancer survival outcome prediction. First, to extract valuable biological insights, we regulate the embedding space of a foundation model, scGPT, which was initially trained on single-cell RNA-seq data, making it adaptable to bulk RNA-seq data. Second, to address the imbalance-between-modalities problem, we propose a gradient modulation mechanism tailored to the Cox partial likelihood loss for survival prediction. The contributions of the modalities are dynamically monitored and adjusted during training, encouraging both modalities to be sufficiently trained. Evaluated on two TCGA (The Cancer Genome Atlas) datasets, our model achieves substantially improved survival prediction accuracy.
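
For readers unfamiliar with the loss named in the abstract, the following is a minimal sketch of the standard negative log Cox partial likelihood, in which each observed event contributes its predicted log-risk minus the log-sum-exp of the log-risks of all patients still at risk at that time. The second function is purely illustrative: the abstract does not specify the modulation rule, so `modulation_coefficients`, its `alpha` parameter, and the tanh-based damping are assumptions chosen only to show how per-modality gradient scales might be derived from unimodal losses. It should not be read as the authors' mechanism.

```python
import math


def cox_neg_log_partial_likelihood(risks, times, events):
    """Negative log Cox partial likelihood, averaged over observed events.

    risks  -- predicted log-risk scores, one per patient
    times  -- observed follow-up times
    events -- 1 if the event (death) was observed, 0 if censored
    Ties are handled Breslow-style: tied patients share a risk set.
    """
    total, n_events = 0.0, 0
    for r_i, t_i, e_i in zip(risks, times, events):
        if not e_i:
            continue  # censored patients only appear inside risk sets
        # Risk set: everyone still under observation at time t_i.
        risk_set = [r for r, t in zip(risks, times) if t >= t_i]
        shift = max(risk_set)  # log-sum-exp shift for numerical stability
        log_denom = shift + math.log(sum(math.exp(r - shift) for r in risk_set))
        total += r_i - log_denom
        n_events += 1
    return -total / max(n_events, 1)


def modulation_coefficients(loss_path, loss_omic, alpha=1.0):
    """Hypothetical rule (an assumption, not from the paper): damp the
    gradient of whichever modality currently has the lower unimodal loss.

    Returns (scale_for_pathology_grads, scale_for_genomic_grads).
    """
    gap = math.tanh(alpha * abs(loss_path - loss_omic))
    if loss_path < loss_omic:   # pathology branch is ahead
        return 1.0 - gap, 1.0
    return 1.0, 1.0 - gap       # genomic branch is ahead (or tie: gap == 0)


# Toy cohort: equal risk scores, three observed deaths at distinct times.
loss = cox_neg_log_partial_likelihood([0.0, 0.0, 0.0], [1, 2, 3], [1, 1, 1])
k_path, k_omic = modulation_coefficients(loss_path=0.5, loss_omic=1.0)
```

In a training loop, coefficients like `k_path` and `k_omic` would multiply the gradients of the corresponding encoder before the optimizer step, so the modality that is currently ahead takes smaller updates while the lagging one continues to learn at full rate.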