Enhancing Single-Cell RNA-seq Data Completeness with a Graph Learning Framework.

IF 3.6 | Zone 3 (Biology) | JCR Q2: BIOCHEMICAL RESEARCH METHODS
Snehalika Lall, Sumanta Ray, Sanghamitra Bandyopadhyay
Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Publication date: 2024-11-06
Publication type: Journal Article
DOI: 10.1109/TCBB.2024.3492384 (https://doi.org/10.1109/TCBB.2024.3492384)
Citations: 0

Abstract

Single-cell RNA sequencing (scRNA-seq) is a powerful tool for capturing gene expression snapshots in individual cells. However, the low amount of RNA in individual cells results in dropout events, which introduce a large number of zero counts into the single-cell expression matrix. We have developed VAImpute, a variational graph autoencoder-based imputation technique that learns the inherent distribution of a large network/graph constructed from scRNA-seq data, leveraging copula correlation (Ccor) among cells/genes. The trained model is used to predict dropout events by computing the probability of all non-edges (cell-gene) in the network. We devise an algorithm to impute the missing expression values of the detected dropouts. The performance of the proposed model is assessed on both simulated and real scRNA-seq datasets and compared with established single-cell imputation methods. VAImpute yields significant improvements in dropout detection, thereby achieving superior performance in cell clustering, rare cell detection, and differential expression analysis. All code and datasets are available on GitHub: https://github.com/sumantaray/VAImputeAvailability.
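
The idea of scoring cell-gene non-edges can be illustrated with a small sketch. This is not the VAImpute implementation: the function name `rank_dropout_candidates`, the toy matrix, and the parameters `latent_dim` and `top_k` are hypothetical, and a truncated SVD of the binary cell-gene adjacency is swapped in for the trained variational graph autoencoder encoder. Only the decoding step, a sigmoid over the inner product of a cell embedding and a gene embedding, follows the usual graph autoencoder convention, and whether VAImpute uses exactly this decoder is an assumption.

```python
import numpy as np


def rank_dropout_candidates(expr, latent_dim=2, top_k=3):
    """Rank zero entries of a cells x genes matrix as candidate dropouts.

    Assumption: embeddings come from a truncated SVD of the binary
    cell-gene adjacency, standing in for a trained VGAE encoder.
    """
    # An edge exists where a gene is detected in a cell.
    adj = (np.asarray(expr) > 0).astype(float)

    # Stand-in "encoder": low-rank factors of the adjacency matrix.
    u, s, vt = np.linalg.svd(adj, full_matrices=False)
    k = min(latent_dim, s.size)
    z_cells = u[:, :k] * np.sqrt(s[:k])   # one embedding per cell
    z_genes = vt[:k].T * np.sqrt(s[:k])   # one embedding per gene

    # Inner-product decoder: sigmoid(z_cell . z_gene) as edge probability.
    prob = 1.0 / (1.0 + np.exp(-(z_cells @ z_genes.T)))

    # Only non-edges (observed zeros) are candidate dropouts.
    cells, genes = np.nonzero(adj == 0)
    order = np.argsort(-prob[cells, genes])[:top_k]
    return [
        (int(cells[i]), int(genes[i]), float(prob[cells[i], genes[i]]))
        for i in order
    ]


# Toy data: two cell groups, each expressing its own block of genes.
expr = np.zeros((6, 6))
expr[:3, :3] = 5.0
expr[3:, 3:] = 7.0
expr[0, 1] = 0.0  # simulated dropout inside the first block

candidates = rank_dropout_candidates(expr)
print(candidates[0][:2])  # (cell, gene) of the highest-scoring non-edge
```

In this toy case the zero placed inside an otherwise expressed block scores higher than the zeros between blocks, which is the behaviour a dropout detector should show. Imputing a value for the flagged entries is a separate step that the abstract does not detail, so it is left out here.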

Source journal
CiteScore: 7.50
Self-citation rate: 6.70%
Articles published: 479
Review time: 3 months
Journal description: IEEE/ACM Transactions on Computational Biology and Bioinformatics emphasizes the algorithmic, mathematical, statistical and computational methods that are central in bioinformatics and computational biology; the development and testing of effective computer programs in bioinformatics; the development of biological databases; important biological results that are obtained from the use of these methods, programs and databases; and the emerging field of Systems Biology, where many forms of data are used to create a computer-based model of a complex biological system.