scVGAE: A Novel Approach using ZINB-Based Variational Graph Autoencoder for Single-Cell RNA-Seq Imputation

Yoshitaka Inoue
Source: arXiv - QuanBio - Genomics. Published: 2024-03-13. DOI: arxiv-2403.08959 (https://doi.org/arxiv-2403.08959)

Abstract

Single-cell RNA sequencing (scRNA-seq) has revolutionized our ability to study differences between individual cells and to uncover cell-specific characteristics. However, a significant technical challenge in scRNA-seq analysis is the occurrence of "dropout" events, in which the expression of certain genes goes undetected. The problem is most pronounced for genes with low or sparse expression, and it reduces the precision and interpretability of the resulting data. To address it, various imputation methods have been developed to predict these missing values and thereby make the analysis more accurate and useful. A prevailing hypothesis holds that scRNA-seq data follow a zero-inflated negative binomial (ZINB) distribution, and methods have accordingly been developed to model the data under this distribution. Deep learning approaches have recently emerged in scRNA-seq analysis. Some, such as variational autoencoders, incorporate the ZINB distribution into the model's loss function. Graph-based methods such as Graph Convolutional Networks (GCN) and Graph Attention Networks (GAT) have also gained attention as deep learning methods for scRNA-seq analysis. This study introduces scVGAE, an approach that integrates GCN into a variational autoencoder framework and uses a ZINB loss function. This integration offers a promising way to address dropout events in scRNA-seq data and thereby improve the accuracy and reliability of downstream analyses. In cell clustering, scVGAE outperforms other methods, achieving the best performance on 11 of 14 datasets. An ablation study shows that all components of scVGAE are necessary. scVGAE is implemented in Python and is available at https://github.com/inoue0426/scVGAE.
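To make the idea of a ZINB loss concrete, the following is a minimal sketch of a ZINB negative log-likelihood in plain Python. It is not taken from the scVGAE repository. The function names and the parameterization (mean `mu`, dispersion `theta`, zero-inflation probability `pi`) are assumptions chosen for illustration, and a real model would predict these parameters per gene and per cell with a neural network rather than fix them.

```python
import math


def nb_log_pmf(k, mu, theta):
    """Log-probability of count k under a negative binomial
    with mean mu > 0 and dispersion theta > 0 (assumed parameterization)."""
    return (
        math.lgamma(k + theta) - math.lgamma(theta) - math.lgamma(k + 1)
        + theta * math.log(theta / (theta + mu))
        + k * math.log(mu / (theta + mu))
    )


def zinb_log_pmf(k, mu, theta, pi):
    """Log-probability of count k under a zero-inflated negative binomial.

    pi (0 <= pi < 1) is the probability of an extra, "dropout" zero.
    """
    nb = nb_log_pmf(k, mu, theta)
    if k == 0:
        # A zero can come from the dropout component or from the NB itself.
        return math.log(pi + (1.0 - pi) * math.exp(nb))
    # A non-zero count can only come from the NB component.
    return math.log(1.0 - pi) + nb


def zinb_nll(counts, mu, theta, pi):
    """Mean negative log-likelihood of observed counts (hypothetical loss helper)."""
    return -sum(zinb_log_pmf(k, mu, theta, pi) for k in counts) / len(counts)
```

Under these assumptions, a call such as `zinb_nll([0, 0, 3, 1, 0], mu=2.0, theta=1.5, pi=0.3)` returns a scalar that a training loop would minimize. Setting `pi=0.0` reduces the model to an ordinary negative binomial, which is one way to see what the zero-inflation term contributes.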