scE2EGAE: enhancing single-cell RNA-Seq data analysis through an end-to-end cell-graph-learnable graph autoencoder with differentiable edge sampling.

IF 5.7 · Zone 2 (Biology) · JCR Q1 (Biology)
Shuo Wang, Yuanning Liu, Hao Zhang, Zhen Liu
Biology Direct 20(1):66. Published 2025-05-27. DOI: 10.1186/s13062-025-00616-z (https://doi.org/10.1186/s13062-025-00616-z). Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12108024/pdf/
Citations: 0

Abstract

Background: Single-cell RNA sequencing (scRNA-Seq) reveals biological processes and molecular-level genomic information at the level of individual cells. Numerous computational methods, including methods based on graph neural networks (GNNs), have been developed to enhance scRNA-Seq data analysis. However, existing GNN-based methods usually construct a fixed graph with the k-nearest neighbors algorithm, which may result in information loss.

Methods: To address this problem, we propose scE2EGAE, which learns the cell graph during training. First, the scRNA-Seq data are fed into a deep count autoencoder (DCA). Second, the hidden representations of the DCA are extracted and used to generate cell-to-cell graph edges through a straight-through estimator (STE) based on top-k sampling and Gumbel-Softmax. Finally, the generated cell-to-cell graph and the scRNA-Seq data are fed into a GNN-based downstream task. Here, the downstream task is a graph autoencoder that denoises the scRNA-Seq data.
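The edge-generation step can be illustrated with a minimal sketch of Gumbel-perturbed top-k sampling. This is not the authors' implementation: the dot-product similarity, the temperature `tau`, the per-row softmax relaxation, and the names `sample_edges`, `gumbel_noise`, and `cells` are all assumptions made for illustration. The sketch covers the forward pass only; the straight-through part needs an autodiff framework and is indicated in a comment.

```python
import math
import random


def gumbel_noise(rng):
    # Standard Gumbel(0, 1) sample by inverse transform; clamp u away from 0.
    u = max(rng.random(), 1e-12)
    return -math.log(-math.log(u))


def softmax(xs):
    # Numerically stable softmax; exp(-inf) evaluates to 0.0 for masked entries.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def sample_edges(embeddings, k, tau=1.0, seed=0):
    """Pick k neighbours per cell from Gumbel-perturbed similarity scores.

    Returns (hard, soft). hard[i][j] is 1.0 when edge i -> j is selected;
    soft[i] is the temperature-scaled softmax over the same perturbed scores.
    """
    n = len(embeddings)
    if not 1 <= k <= n - 1:
        raise ValueError("k must be between 1 and n - 1")
    rng = random.Random(seed)
    hard, soft = [], []
    for i in range(n):
        scores = []
        for j in range(n):
            if i == j:
                scores.append(float("-inf"))  # exclude self-loops
                continue
            # Assumption: dot-product similarity between hidden representations.
            sim = sum(a * b for a, b in zip(embeddings[i], embeddings[j]))
            scores.append((sim + gumbel_noise(rng)) / tau)
        top = set(sorted(range(n), key=lambda j: scores[j], reverse=True)[:k])
        hard.append([1.0 if j in top else 0.0 for j in range(n)])
        soft.append(softmax(scores))
    return hard, soft


# Hypothetical 2-D hidden representations for four cells.
cells = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
hard, soft = sample_edges(cells, k=2, tau=0.5, seed=42)
# In an autodiff framework, the straight-through output would be
# soft + stop_gradient(hard - soft): hard values forward, soft gradients backward.
```

Each row of `hard` contains exactly `k` selected neighbours, while `soft` provides a differentiable surrogate through which gradients could flow back to the encoder that produced the embeddings.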

Results: We evaluate scE2EGAE on eight public scRNA-Seq datasets and compare it with seven existing scRNA-Seq denoising methods. The experiments cover: 1) denoising performance, measured by mean absolute error, Pearson correlation coefficient, and cosine similarity; 2) clustering performance on the denoised results, measured by adjusted Rand index, normalized mutual information, and silhouette score; and 3) cell trajectory inference performance on the denoised results, measured by the pseudo-temporal ordering score. The results show that, on the scRNA-Seq denoising task, scE2EGAE outperforms most of the compared methods, indicating that it can learn cell-to-cell graphs that capture real cell-to-cell relationships.
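For reference, the three denoising metrics named above can be computed as in the sketch below, using their standard definitions. How the paper aggregates them (per cell, per gene, or over the whole matrix) is not stated in the abstract, so applying them to flattened expression vectors is an assumption, and the sample values are hypothetical.

```python
import math


def mean_absolute_error(truth, denoised):
    # Average absolute difference between reference and denoised values.
    return sum(abs(t - d) for t, d in zip(truth, denoised)) / len(truth)


def pearson_correlation(truth, denoised):
    # Covariance divided by the product of standard deviations.
    # Undefined (division by zero) if either input is constant.
    n = len(truth)
    mean_t = sum(truth) / n
    mean_d = sum(denoised) / n
    cov = sum((t - mean_t) * (d - mean_d) for t, d in zip(truth, denoised))
    var_t = sum((t - mean_t) ** 2 for t in truth)
    var_d = sum((d - mean_d) ** 2 for d in denoised)
    return cov / math.sqrt(var_t * var_d)


def cosine_similarity(truth, denoised):
    # Dot product divided by the product of Euclidean norms.
    # Undefined (division by zero) if either input is all zeros.
    dot = sum(t * d for t, d in zip(truth, denoised))
    norm_t = math.sqrt(sum(t * t for t in truth))
    norm_d = math.sqrt(sum(d * d for d in denoised))
    return dot / (norm_t * norm_d)


# Hypothetical flattened expression values (not taken from the paper).
truth = [0.0, 1.0, 2.0, 4.0]
denoised = [0.5, 1.0, 2.5, 3.0]
print("MAE:", mean_absolute_error(truth, denoised))
print("Pearson:", pearson_correlation(truth, denoised))
print("Cosine:", cosine_similarity(truth, denoised))
```

Lower mean absolute error and higher Pearson correlation and cosine similarity indicate a denoised matrix closer to the reference.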

Conclusions: We validate scE2EGAE by applying it to the denoising of scRNA-Seq data. The method learns inter-cellular relationships and constructs cell-to-cell graphs, thereby enhancing downstream analysis of scRNA-Seq data. Our approach may inspire future research on GNN-based scRNA-Seq analysis methods and has broad application prospects.

Source journal: Biology Direct (Biology)
CiteScore: 6.40
Self-citation rate: 10.90%
Articles published: 32
Review time: 7 months
Journal description: Biology Direct serves the life science research community as an open access, peer-reviewed online journal, providing authors and readers with an alternative to the traditional model of peer review. Biology Direct considers original research articles, hypotheses, comments, discovery notes and reviews in subject areas currently identified as those most conducive to the open review approach, primarily those with a significant non-experimental component.