Single-cell RNA sequencing data imputation using similarity preserving network

Duc Tran, Hung Nguyen, F. Harris, Tin Nguyen
2021 13th International Conference on Knowledge and Systems Engineering (KSE)
Published: 2021-11-10
DOI: 10.1109/KSE53942.2021.9648794 (https://doi.org/10.1109/KSE53942.2021.9648794)

Abstract

Recent advances in single-cell RNA sequencing (scRNA-seq) technologies allow us to monitor gene expression in individual cells. This level of detail enables the study of cells in rapidly changing and heterogeneous environments such as early-stage embryos or tumor tissue. However, current scRNA-seq technologies still face many outstanding challenges. Because of the low amount of starting material, a large portion of the expression values in scRNA-seq data are missing and reported as zeros. Moreover, scRNA-seq platforms increasingly prioritize high throughput over sequencing depth, which makes the problem more serious in large datasets. These missing values can greatly affect the accuracy of downstream analyses. Here we introduce scINN, a neural network-based approach that can reliably recover the missing values in single-cell data and thus effectively improve the performance of downstream analyses. To impute the dropouts in single-cell data, we build a neural network that consists of two sub-networks: an imputation sub-network and a quality assessment sub-network. We compare scINN with state-of-the-art imputation methods using 10 scRNA-seq datasets with a total of more than 100,000 cells. In an extensive analysis, we demonstrate that scINN outperforms existing imputation methods in improving the identification of cell sub-populations and the quality of transcriptome landscape visualization.
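The dropout problem described in the abstract can be illustrated with a small simulation. This is a minimal sketch and not part of scINN: the toy matrix size, the gamma/Poisson expression model, the binomial thinning used to mimic limited sequencing depth, and the helper names `simulate_dropout` and `zero_fraction` are all assumptions made for illustration. It only shows why shallower sequencing tends to leave more values reported as zero.

```python
import numpy as np


def simulate_dropout(true_counts, capture_rate, rng):
    """Binomially thin each count (assumed model): every transcript is
    captured independently with probability `capture_rate`."""
    return rng.binomial(true_counts, capture_rate)


def zero_fraction(matrix):
    """Fraction of entries in the expression matrix that are exactly zero."""
    return float(np.mean(matrix == 0))


rng = np.random.default_rng(0)

# Hypothetical toy data: 200 cells x 500 genes with gene-specific mean expression.
gene_means = rng.gamma(shape=2.0, scale=2.0, size=500)
true_counts = rng.poisson(gene_means, size=(200, 500))

# Two assumed capture rates standing in for deeper and shallower sequencing.
deep = simulate_dropout(true_counts, 0.5, rng)
shallow = simulate_dropout(true_counts, 0.05, rng)

for name, mat in [("true", true_counts), ("deep", deep), ("shallow", shallow)]:
    print(f"{name:8s} zero fraction: {zero_fraction(mat):.3f}")
```

In this toy setting the observed zeros mix true absence of expression with transcripts that were simply not captured, which is the ambiguity an imputation method has to resolve.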