Yan Zhong, Yuntong Hou, Yongjian Yang, Xinyue Zheng, James J Cai
Bioinformatics (Oxford, England). Published 2025-09-01. DOI: https://doi.org/10.1093/bioinformatics/btaf483
scPOEM: robust co-embedding of peaks and genes revealing peak-gene regulation.
Motivation: Identifying the regulatory elements across chromosomal regions that influence gene expression is a fundamental challenge in epigenomics, with profound implications for understanding gene regulation and disease mechanisms. Paired single-cell RNA sequencing and single-cell ATAC sequencing have created unprecedented opportunities to address this challenge by profiling gene expression and chromatin accessibility simultaneously at single-cell resolution. However, the signals linking the two modalities are weak because the data are highly sparse and noisy.
Results: This article proposes single-cell meta-Path based Omics EMbedding (scPOEM), a novel embedding method that jointly projects chromatin accessibility peaks and expressed genes into a shared low-dimensional space. By integrating peak-peak, peak-gene, and gene-gene relationships, scPOEM places related peak-gene pairs closer together in the embedding space. Our experiments demonstrate that scPOEM generates stable representations of peaks and genes, outperforms existing methods in recovering biologically meaningful peak-gene regulatory relationships, and enables new insights into subgroup and differential analysis of gene regulation. These results highlight its potential to uncover gene regulatory mechanisms and to enhance the understanding of transcriptional regulation at single-cell resolution.
Availability and implementation: The source code of scPOEM is available at https://github.com/Houyt23/scPOEM. The datasets can be obtained from 10x Genomics (https://www.10xgenomics.com/datasets/pbmc-from-a-healthy-donor-granulocytes-removed-through-cell-sorting-10-k-1-standard-1-0-0) and from the GEO database under accession codes GSE194122 and GSE239916.
Supplementary information: Supplementary data are available at Bioinformatics online.
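Illustrative sketch (not the authors' implementation): the Results paragraph describes combining peak-peak, peak-gene, and gene-gene relationships through meta-paths. One common way to use meta-paths in heterogeneous graphs is to generate random walks that follow a fixed sequence of node types and then feed those walks to a skip-gram model, so that nodes appearing in the same walks receive nearby embeddings. The abstract does not say whether scPOEM works exactly this way. Everything in the block below is a hypothetical toy: the node indices, the edges, the meta-path strings such as "PGGP", and the function name `metapath_walk` are assumptions made only for illustration.

```python
import random


def metapath_walk(neighbors, metapath, start, rng):
    """Follow one random walk along `metapath`, a string of node types.

    `neighbors` maps a (source_type, target_type) pair to a dict from a
    source node index to the list of target node indices it is linked to.
    Returns a list of (node_type, index) tuples. The walk stops early if
    the current node has no neighbor of the required next type.
    """
    walk = [(metapath[0], start)]
    current = start
    for src, dst in zip(metapath, metapath[1:]):
        candidates = neighbors[(src, dst)].get(current, [])
        if not candidates:
            break  # dead end: no edge of the required type
        current = rng.choice(candidates)
        walk.append((dst, current))
    return walk


# Hypothetical toy relations: 4 peaks (P) and 3 genes (G).
peak_peak = {0: [1], 1: [0, 2], 2: [1]}          # peak 3 has no peak neighbor
peak_gene = {0: [0], 1: [0, 1], 2: [1], 3: [2]}
gene_gene = {0: [1], 1: [0]}                     # gene 2 has no gene neighbor

# The gene-to-peak relation is the reverse of the peak-to-gene relation.
gene_peak = {}
for peak, genes in peak_gene.items():
    for gene in genes:
        gene_peak.setdefault(gene, []).append(peak)

neighbors = {
    ("P", "P"): peak_peak,
    ("P", "G"): peak_gene,
    ("G", "P"): gene_peak,
    ("G", "G"): gene_gene,
}

# Assumed meta-paths, chosen only to exercise all three relation types.
metapaths = ["PGP", "PPG", "PGGP"]
rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Five walks per starting peak and per meta-path.
walks = [
    metapath_walk(neighbors, mp, peak, rng)
    for mp in metapaths
    for peak in range(4)
    for _ in range(5)
]
```

In a full pipeline, walks like these would serve as "sentences" for an embedding model, which is one way related peaks and genes could end up close together in a shared space. The early stop at dead ends is a design choice of this sketch, not something stated in the source.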