GeneCompass:利用知识型跨物种基础模型破译通用基因调控机制

IF 28.1 1区 生物学 Q1 CELL BIOLOGY
Xiaodong Yang, Guole Liu, Guihai Feng, Dechao Bu, Pengfei Wang, Jie Jiang, Shubai Chen, Qinmeng Yang, Hefan Miao, Yiyang Zhang, Zhenpeng Man, Zhongming Liang, Zichen Wang, Yaning Li, Zheng Li, Yana Liu, Yao Tian, Wenhao Liu, Cong Li, Ao Li, Jingxi Dong, Zhilong Hu, Chen Fang, Lina Cui, Zixu Deng, Haiping Jiang, Wentao Cui, Jiahao Zhang, Zhaohui Yang, Handong Li, Xingjian He, Liqun Zhong, Jiaheng Zhou, Zijian Wang, Qingqing Long, Ping Xu, Hongmei Wang, Zhen Meng, Xuezhi Wang, Yangang Wang, Yong Wang, Shihua Zhang, Jingtao Guo, Yi Zhao, Yuanchun Zhou, Fei Li, Jing Liu, Yiqiang Chen, Ge Yang, Xin Li
{"title":"GeneCompass:利用知识型跨物种基础模型破译通用基因调控机制","authors":"Xiaodong Yang, Guole Liu, Guihai Feng, Dechao Bu, Pengfei Wang, Jie Jiang, Shubai Chen, Qinmeng Yang, Hefan Miao, Yiyang Zhang, Zhenpeng Man, Zhongming Liang, Zichen Wang, Yaning Li, Zheng Li, Yana Liu, Yao Tian, Wenhao Liu, Cong Li, Ao Li, Jingxi Dong, Zhilong Hu, Chen Fang, Lina Cui, Zixu Deng, Haiping Jiang, Wentao Cui, Jiahao Zhang, Zhaohui Yang, Handong Li, Xingjian He, Liqun Zhong, Jiaheng Zhou, Zijian Wang, Qingqing Long, Ping Xu, Hongmei Wang, Zhen Meng, Xuezhi Wang, Yangang Wang, Yong Wang, Shihua Zhang, Jingtao Guo, Yi Zhao, Yuanchun Zhou, Fei Li, Jing Liu, Yiqiang Chen, Ge Yang, Xin Li","doi":"10.1038/s41422-024-01034-y","DOIUrl":null,"url":null,"abstract":"<p>Deciphering universal gene regulatory mechanisms in diverse organisms holds great potential for advancing our knowledge of fundamental life processes and facilitating clinical applications. However, the traditional research paradigm primarily focuses on individual model organisms and does not integrate various cell types across species. Recent breakthroughs in single-cell sequencing and deep learning techniques present an unprecedented opportunity to address this challenge. In this study, we built an extensive dataset of over 120 million human and mouse single-cell transcriptomes. After data preprocessing, we obtained 101,768,420 single-cell transcriptomes and developed a knowledge-informed cross-species foundation model, named GeneCompass. During pre-training, GeneCompass effectively integrated four types of prior biological knowledge to enhance our understanding of gene regulatory mechanisms in a self-supervised manner. By fine-tuning for multiple downstream tasks, GeneCompass outperformed state-of-the-art models in diverse applications for a single species and unlocked new realms of cross-species biological investigations. We also employed GeneCompass to search for key factors associated with cell fate transition and showed that the predicted candidate genes could successfully induce the differentiation of human embryonic stem cells into the gonadal fate. Overall, GeneCompass demonstrates the advantages of using artificial intelligence technology to decipher universal gene regulatory mechanisms and shows tremendous potential for accelerating the discovery of critical cell fate regulators and candidate drug targets.</p>","PeriodicalId":9926,"journal":{"name":"Cell Research","volume":null,"pages":null},"PeriodicalIF":28.1000,"publicationDate":"2024-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"GeneCompass: deciphering universal gene regulatory mechanisms with a knowledge-informed cross-species foundation model\",\"authors\":\"Xiaodong Yang, Guole Liu, Guihai Feng, Dechao Bu, Pengfei Wang, Jie Jiang, Shubai Chen, Qinmeng Yang, Hefan Miao, Yiyang Zhang, Zhenpeng Man, Zhongming Liang, Zichen Wang, Yaning Li, Zheng Li, Yana Liu, Yao Tian, Wenhao Liu, Cong Li, Ao Li, Jingxi Dong, Zhilong Hu, Chen Fang, Lina Cui, Zixu Deng, Haiping Jiang, Wentao Cui, Jiahao Zhang, Zhaohui Yang, Handong Li, Xingjian He, Liqun Zhong, Jiaheng Zhou, Zijian Wang, Qingqing Long, Ping Xu, Hongmei Wang, Zhen Meng, Xuezhi Wang, Yangang Wang, Yong Wang, Shihua Zhang, Jingtao Guo, Yi Zhao, Yuanchun Zhou, Fei Li, Jing Liu, Yiqiang Chen, Ge Yang, Xin Li\",\"doi\":\"10.1038/s41422-024-01034-y\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>Deciphering universal gene regulatory mechanisms in diverse organisms holds great potential for advancing our knowledge of fundamental life processes and facilitating clinical applications. However, the traditional research paradigm primarily focuses on individual model organisms and does not integrate various cell types across species. Recent breakthroughs in single-cell sequencing and deep learning techniques present an unprecedented opportunity to address this challenge. In this study, we built an extensive dataset of over 120 million human and mouse single-cell transcriptomes. After data preprocessing, we obtained 101,768,420 single-cell transcriptomes and developed a knowledge-informed cross-species foundation model, named GeneCompass. During pre-training, GeneCompass effectively integrated four types of prior biological knowledge to enhance our understanding of gene regulatory mechanisms in a self-supervised manner. By fine-tuning for multiple downstream tasks, GeneCompass outperformed state-of-the-art models in diverse applications for a single species and unlocked new realms of cross-species biological investigations. We also employed GeneCompass to search for key factors associated with cell fate transition and showed that the predicted candidate genes could successfully induce the differentiation of human embryonic stem cells into the gonadal fate. Overall, GeneCompass demonstrates the advantages of using artificial intelligence technology to decipher universal gene regulatory mechanisms and shows tremendous potential for accelerating the discovery of critical cell fate regulators and candidate drug targets.</p>\",\"PeriodicalId\":9926,\"journal\":{\"name\":\"Cell Research\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":28.1000,\"publicationDate\":\"2024-10-08\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Cell Research\",\"FirstCategoryId\":\"99\",\"ListUrlMain\":\"https://doi.org/10.1038/s41422-024-01034-y\",\"RegionNum\":1,\"RegionCategory\":\"生物学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"CELL BIOLOGY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Cell Research","FirstCategoryId":"99","ListUrlMain":"https://doi.org/10.1038/s41422-024-01034-y","RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"CELL BIOLOGY","Score":null,"Total":0}
引用次数: 0

摘要

破译不同生物体中的通用基因调控机制,对于增进我们对基本生命过程的了解和促进临床应用具有巨大潜力。然而,传统的研究范式主要关注单个模式生物,并没有整合不同物种的各种细胞类型。单细胞测序和深度学习技术的最新突破为应对这一挑战提供了前所未有的机遇。在这项研究中,我们建立了一个包含超过 1.2 亿个人类和小鼠单细胞转录组的广泛数据集。经过数据预处理后,我们获得了 101,768,420 个单细胞转录组,并开发了一个基于知识的跨物种基础模型,命名为 GeneCompass。在预训练过程中,GeneCompass 有效地整合了四种先验生物学知识,以自我监督的方式增强了我们对基因调控机制的理解。通过对多个下游任务进行微调,GeneCompass 在单一物种的各种应用中表现优于最先进的模型,并开启了跨物种生物研究的新领域。我们还利用 GeneCompass 搜索与细胞命运转变相关的关键因素,结果表明预测的候选基因能成功诱导人类胚胎干细胞向性腺命运分化。总之,GeneCompass 展示了利用人工智能技术破译通用基因调控机制的优势,并显示了加速发现关键细胞命运调控因子和候选药物靶点的巨大潜力。
本文章由计算机程序翻译,如有差异,请以英文原文为准。

GeneCompass: deciphering universal gene regulatory mechanisms with a knowledge-informed cross-species foundation model

GeneCompass: deciphering universal gene regulatory mechanisms with a knowledge-informed cross-species foundation model

Deciphering universal gene regulatory mechanisms in diverse organisms holds great potential for advancing our knowledge of fundamental life processes and facilitating clinical applications. However, the traditional research paradigm primarily focuses on individual model organisms and does not integrate various cell types across species. Recent breakthroughs in single-cell sequencing and deep learning techniques present an unprecedented opportunity to address this challenge. In this study, we built an extensive dataset of over 120 million human and mouse single-cell transcriptomes. After data preprocessing, we obtained 101,768,420 single-cell transcriptomes and developed a knowledge-informed cross-species foundation model, named GeneCompass. During pre-training, GeneCompass effectively integrated four types of prior biological knowledge to enhance our understanding of gene regulatory mechanisms in a self-supervised manner. By fine-tuning for multiple downstream tasks, GeneCompass outperformed state-of-the-art models in diverse applications for a single species and unlocked new realms of cross-species biological investigations. We also employed GeneCompass to search for key factors associated with cell fate transition and showed that the predicted candidate genes could successfully induce the differentiation of human embryonic stem cells into the gonadal fate. Overall, GeneCompass demonstrates the advantages of using artificial intelligence technology to decipher universal gene regulatory mechanisms and shows tremendous potential for accelerating the discovery of critical cell fate regulators and candidate drug targets.

求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
Cell Research
Cell Research 生物-细胞生物学
CiteScore
53.90
自引率
0.70%
发文量
2420
审稿时长
2.3 months
期刊介绍: Cell Research (CR) is an international journal published by Springer Nature in partnership with the Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences (CAS). It focuses on publishing original research articles and reviews in various areas of life sciences, particularly those related to molecular and cell biology. The journal covers a broad range of topics including cell growth, differentiation, and apoptosis; signal transduction; stem cell biology and development; chromatin, epigenetics, and transcription; RNA biology; structural and molecular biology; cancer biology and metabolism; immunity and molecular pathogenesis; molecular and cellular neuroscience; plant molecular and cell biology; and omics, system biology, and synthetic biology. CR is recognized as China's best international journal in life sciences and is part of Springer Nature's prestigious family of Molecular Cell Biology journals.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信