Histology-Based Virtual RNA Inference Identifies Pathways Associated with Metastasis Risk in Colorectal Cancer.

Gokul Srinivasan, Minh-Khang Le, Zarif Azher, Xiaoying Liu, Louis Vaickus, Harsimran Kaur, Fred Kolling, Scott Palisoul, Laurent Perreard, Ken S Lau, Keluo Yao, Joshua Levy
Journal: medRxiv : the preprint server for health sciences
Publication date: 2025-04-23
DOI: 10.1101/2025.04.22.25326170 (https://doi.org/10.1101/2025.04.22.25326170)
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12045403/pdf/
Citation count: 0

Abstract

Colorectal cancer (CRC) remains a major health concern, with over 150,000 new diagnoses and more than 50,000 deaths annually in the United States, underscoring an urgent need for improved screening, prognostication, disease management, and therapeutic approaches. The tumor microenvironment (TME), comprising cancerous and immune cells interacting within the tumor's spatial architecture, plays a critical role in disease progression and treatment outcomes, reinforcing its importance as a prognostic marker for metastasis and recurrence risk. However, traditional methods for TME characterization, such as bulk transcriptomics and multiplex protein assays, lack sufficient spatial resolution. Although spatial transcriptomics (ST) maps whole transcriptomes at near-cellular resolution, current ST technologies (e.g., Visium, Xenium) are limited by high cost, low throughput, and reproducibility issues, which prevents their widespread use in large-scale molecular epidemiology studies. In this study, we refined and implemented Virtual RNA Inference (VRI) to derive ST-level molecular information directly from hematoxylin and eosin (H&E)-stained tissue images. Our VRI models were trained on the largest matched CRC ST dataset to date, comprising 45 patients and more than 300,000 Visium spots from primary tumors. Using state-of-the-art architectures (UNI, ResNet-50, ViT, and VMamba), we achieved a median Spearman's correlation coefficient of 0.546 between predicted and measured spot-level expression. As validation, VRI-derived gene signatures linked to specific tissue regions (tumor, interface, submucosa, stroma, serosa, muscularis, inflammation) showed strong concordance with signatures generated via direct ST, and VRI accurately estimated the spatial distribution of cell-type proportions from H&E slides. In an expanded CRC cohort, controlling for tumor invasiveness and clinical factors, we further identified VRI-derived gene signatures significantly associated with key prognostic outcomes, including metastasis status. Although certain tumor-related pathways are not fully captured by histology alone, our findings highlight the ability of VRI to infer a wide range of "histology-associated" biological pathways at near-cellular resolution without requiring ST profiling. Future work will extend this framework to broaden TME phenotyping from standard H&E tissue images, with the potential to accelerate translational CRC research at scale.
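
The headline metric above, a median Spearman's correlation between predicted and measured spot-level expression, can be illustrated with a minimal sketch. The abstract does not say how the correlations were aggregated, so this example assumes one Spearman coefficient per gene, computed across spots, with the median taken over genes. The function names (`average_ranks`, `spearman`, `median_gene_spearman`), the gene labels, and the toy values are all hypothetical and are not taken from the study.

```python
from statistics import median


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns NaN if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the tie-averaged ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def median_gene_spearman(measured, predicted):
    """Median over genes of the per-gene Spearman's rho across spots.

    Both arguments map a gene name to a list of per-spot values,
    with spots in the same order in both mappings (an assumption).
    """
    rhos = [spearman(measured[g], predicted[g]) for g in measured]
    rhos = [r for r in rhos if r == r]  # drop NaN from constant genes
    return median(rhos)


# Toy data only: three hypothetical genes measured at five spots.
measured = {
    "GENE_A": [1, 2, 3, 4, 5],
    "GENE_B": [5, 3, 1, 4, 2],
    "GENE_C": [2, 2, 3, 5, 8],
}
predicted = {
    "GENE_A": [0.9, 2.1, 2.9, 4.2, 5.5],
    "GENE_B": [2, 3, 1, 4, 5],
    "GENE_C": [1, 2, 3, 4, 5],
}

print(median_gene_spearman(measured, predicted))
```

A rank-based coefficient is a natural choice for this kind of comparison because it rewards predictions that preserve the ordering of expression across spots, even when the predicted values are on a different scale from the measured counts.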
