Gene print-based cell subtypes annotation of human disease across heterogeneous datasets with gPRINT.

IF 13.6, Zone 1 (Biology), Q1 CELL BIOLOGY
Ruojin Yan, Chunmei Fan, Shen Gu, Tingzhang Wang, Zi Yin, Xiao Chen
Journal: Protein & Cell. Publication date: 2025-03-14. Publication type: Journal Article. DOI: 10.1093/procel/pwaf001 (https://doi.org/10.1093/procel/pwaf001)
Citations: 0

Gene print-based cell subtypes annotation of human disease across heterogeneous datasets with gPRINT.

Identification of disease-specific cell subtypes (DSCSs) has profound implications for understanding disease mechanisms, preoperative diagnosis, and precision therapy. However, achieving unified annotation of DSCSs in heterogeneous single-cell datasets remains a challenge. In this study, we developed the gPRINT algorithm (generalized approach for cell subtype Identification with single cell's voicePRINT). Inspired by the principles of speech recognition in noisy environments, gPRINT transforms gene position and gene expression information into voiceprints based on ordered and clustered gene expression phenomena, obtaining a unique "gene print" pattern for each cell. We then integrated neural networks to mitigate the impact of background noise on cell identity label mapping. We demonstrated the reproducibility of gPRINT across different donors, single-cell sequencing platforms, and disease subtypes, and its utility for automatic cell subtype annotation across datasets. Moreover, gPRINT achieved an annotation accuracy of 98.37% in external validation on the same tissue, surpassing other algorithms. Furthermore, this approach has been applied to fibrosis-associated diseases in multiple tissues throughout the body, as well as to the annotation of fibroblast subtypes in a single tissue, tendon, where fibrosis is prevalent. We successfully achieved automatic prediction of tendinopathy-specific cell subtypes, key targets, and related drugs. In summary, gPRINT provides an automated and unified approach for identifying DSCSs across datasets, facilitating the elucidation of specific cell subtypes under different disease states and providing a powerful tool for exploring therapeutic targets in diseases.
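
The abstract describes turning gene position and expression information into a per-cell "gene print", but it gives no implementation details. As a rough illustration only, the Python sketch below orders genes by genomic coordinate and averages expression over contiguous windows to obtain a fixed-length profile. The function name `gene_print`, the `n_bins` parameter, and the windowed-mean encoding are assumptions made for this example and are not the published gPRINT method.

```python
import numpy as np


def gene_print(expression, chrom, start, n_bins=8):
    """Return a fixed-length, position-ordered expression profile for one cell.

    Hypothetical encoding (not from the paper): genes are sorted by
    (chromosome, start position), then the ordered expression values are
    averaged within n_bins contiguous windows.
    """
    expression = np.asarray(expression, dtype=float)
    if n_bins < 1 or n_bins > expression.size:
        raise ValueError("n_bins must be between 1 and the number of genes")
    # np.lexsort uses the LAST key as the primary key, so chrom sorts first
    # and start position breaks ties within a chromosome.
    order = np.lexsort((np.asarray(start), np.asarray(chrom)))
    ordered = expression[order]
    # Split the ordered genes into contiguous windows and average each one.
    windows = np.array_split(ordered, n_bins)
    return np.array([window.mean() for window in windows])
```

A profile of this kind could then be fed to a classifier of the reader's choice. The abstract states only that neural networks are used for label mapping, so the model architecture is left open here.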

Source journal: Protein & Cell (CELL BIOLOGY)
CiteScore: 24.00
Self-citation rate: 0.90%
Articles published: 1029
Review time: 6-12 weeks
Journal description: Protein & Cell is a monthly, peer-reviewed, open-access journal focusing on multidisciplinary aspects of biology and biomedicine, with a primary emphasis on protein and cell research. It publishes original research articles, reviews, and commentaries across various fields including biochemistry, biophysics, cell biology, genetics, immunology, microbiology, molecular biology, neuroscience, oncology, protein science, structural biology, and translational medicine. The journal also features content on research policies and funding trends in China, and serves as a platform for academic exchange among life science researchers.