Minghao Han, Dingkang Yang, Jiabei Cheng, Xukun Zhang, Zizhi Chen, Haopeng Kuang, Lihua Zhang
Pattern Recognition, vol. 172, Article 112458. Published 2025-09-23. Impact factor 7.6; JCR Q1 (Computer Science, Artificial Intelligence). DOI: 10.1016/j.patcog.2025.112458. https://www.sciencedirect.com/science/article/pii/S0031320325011215
Towards unified molecule-enhanced pathology image representation learning via integrating spatial transcriptomics
Recent advancements in multimodal pre-training have advanced computational pathology, but current visual-language approaches lack molecular perspective and face performance bottlenecks in clinical settings. Here, we introduce a Unified Molecule-enhanced Pathology Image REpresentation Learning framework (UMPIRE) that enhances the robustness and generalization capabilities of pathology image analysis across diverse tissue types and sequencing platforms. UMPIRE leverages complementary information from gene expression profiles to guide multimodal pre-training, addressing the challenge of distribution shifts between research and clinical environments. To overcome the scarcity of paired data, we collected more than 4 million entries of spatial transcriptomics gene expression to train the gene encoder. UMPIRE aligns modalities across 697K pathology image-gene expression pairs, creating a foundation model that demonstrates superior generalization across multiple sequencing platforms and downstream tasks without additional fine-tuning. Comprehensive evaluation shows UMPIRE’s effectiveness in gene expression prediction, spot classification, and mutation state prediction in whole slide images, with significant improvements over state-of-the-art methods. Our findings demonstrate how molecular data integration enhances visual pattern recognition in computational pathology, providing a resilient approach for bench-to-bedside translation. The code and pre-trained weights are available at https://github.com/Hanminghao/Umpire.
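The abstract does not state which objective is used to align image and gene-expression embeddings. As an illustration only, the sketch below assumes a symmetric, CLIP-style contrastive (InfoNCE) loss, a common choice in multimodal pre-training; the function names, the temperature value, and the toy embeddings are all hypothetical and are not taken from the UMPIRE code base.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cross_entropy_diag(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the row max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def symmetric_contrastive_loss(image_emb, gene_emb, temperature=0.07):
    """Assumed CLIP-style loss: pair i of image_emb matches pair i of gene_emb.

    The temperature of 0.07 is an illustrative default, not a value from the paper.
    """
    img = [l2_normalize(v) for v in image_emb]
    gen = [l2_normalize(v) for v in gene_emb]
    # Similarity matrix: rows index images, columns index gene profiles.
    logits = [
        [sum(a * b for a, b in zip(i_vec, g_vec)) / temperature for g_vec in gen]
        for i_vec in img
    ]
    logits_t = [list(col) for col in zip(*logits)]  # gene-to-image direction
    return 0.5 * (cross_entropy_diag(logits) + cross_entropy_diag(logits_t))


# Toy 2-D embeddings standing in for encoder outputs (hypothetical data).
images = [[1.0, 0.0], [0.0, 1.0]]
genes_matched = [[1.0, 0.0], [0.0, 1.0]]
genes_swapped = [[0.0, 1.0], [1.0, 0.0]]

loss_matched = symmetric_contrastive_loss(images, genes_matched)
loss_swapped = symmetric_contrastive_loss(images, genes_swapped)
```

Under this assumed objective, correctly paired embeddings yield a loss near zero, while swapped pairs are penalized heavily, which is the behavior an alignment stage relies on.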
About the journal:
The field of Pattern Recognition is both mature and rapidly evolving, playing a crucial role in various related fields such as computer vision, image processing, text analysis, and neural networks. It closely intersects with machine learning and is being applied in emerging areas like biometrics, bioinformatics, multimedia data analysis, and data science. The journal Pattern Recognition, established half a century ago during the early days of computer science, has since grown significantly in scope and influence.