Estimating and correcting index hopping misassignments in single-cell RNA-seq data.

Lingling Miao, Loren Collado, Savannah Barkdull, Yoshine Saito, Jay-Hyun Jo, Jungmin Han, Stefania Dell'Orso, Michael C Kelly, Heidi H Kong, Isaac Brownell
Journal: bioRxiv : the preprint server for biology
Publication date: 2024-12-16
DOI: https://doi.org/10.1101/2024.10.21.619353
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11527012/pdf/

Summary

Index hopping can cause gene expression in single-cell RNA sequencing (scRNA-seq) to be assigned to the wrong cell. We performed scRNA-seq on sorted mouse skin cells and identified four distinct cell types from their gene expression profiles. Using genes specific to each cell type, we found that about 0.65% of the total reads per gene were misassigned to another cell because of index hopping, independent of the gene's expression level. To correct for the misassigned reads, we adjusted the gene expression data of each cell proportionally. Applying this correction improved the analysis: cell clustering became more accurate, and the identification of intermediate cell states during differentiation was refined. Our study highlights the importance of index hopping in scRNA-seq experiments and proposes a practical correction method for removing misassigned reads. The method can be applied to any scRNA-seq study involving cells with distinct gene expression profiles, or to studies in which non-expressed genes are introduced during library preparation as a quality-control measure.
Estimating and correcting index hopping misassignments in single-cell RNA-seq data.

Background: Index hopping causes read assignment errors in data from multiplexed sequencing libraries. This issue has become more prevalent with the widespread use of high-capacity sequencers and highly multiplexed single-cell RNA sequencing (scRNA-seq).

Results: We conducted deep, plate-based scRNA-seq on a mixed population of mouse skin cells. Analysis of transcriptomes from 1152 cells identified four distinct cell types. To estimate the error rate in sample assignment due to index hopping, we employed differential expression analysis to identify signature genes that were highly and specifically expressed in each cell type. We quantified the proportion of misassigned reads by examining the detection rates of signature genes in other cell types. Remarkably, regardless of gene expression levels, we estimated that 0.65% of reads per gene were assigned to an incorrect cell across our data. To computationally compensate for index hopping, we developed a simple correction method wherein, for each gene, 0.65% of the library's average expression level was subtracted from the expression in each cell. This correction had notable effects on transcriptome analyses, including increased cell-cell clustering distance and alterations in intermediate state assignments of cell differentiation.
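
The correction described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the function name `correct_index_hopping`, the cells-by-genes matrix layout, and the clipping of negative values to zero are all assumptions, since the abstract states only that 0.65% of each gene's library-average expression is subtracted from every cell.

```python
import numpy as np

# 0.65%, the misassignment rate reported in the abstract for this library.
HOPPING_RATE = 0.0065


def correct_index_hopping(counts, rate=HOPPING_RATE):
    """Subtract rate * (per-gene library mean) from every cell.

    counts: array of shape (n_cells, n_genes) holding expression values.
    Clipping at zero is an assumption made here so that cells with little
    or no signal do not end up with negative expression.
    """
    counts = np.asarray(counts, dtype=float)
    # Average expression of each gene across all cells in the library.
    gene_means = counts.mean(axis=0)
    # Broadcasting subtracts the same per-gene offset from every cell.
    corrected = counts - rate * gene_means
    return np.clip(corrected, 0.0, None)


# Toy example: gene 0 is expressed only in cell 0, gene 1 only in cell 1.
toy = np.array([
    [1000.0, 0.0],
    [0.0, 2000.0],
    [0.0, 0.0],
    [0.0, 0.0],
])
corrected_toy = correct_index_hopping(toy)
```

Because the offset depends only on the gene's library-wide mean, highly expressed genes lose more in absolute terms, while the relative adjustment in a strongly expressing cell stays small. For a different library, the rate would need to be re-estimated rather than reused.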

Conclusions: Index hopping misassignments are measurable and can impact the experimental interpretation of sequencing results. We devised a straightforward method to estimate and correct for the index hopping rate by quantifying misassigned genes in distinct cell types within an scRNA-seq library. This approach can be applied to any barcoded, multiplexed scRNA-seq library containing cells with distinct expression profiles, allowing for correction of the expression matrix before conducting biological analysis.
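
One way to read the estimation step is sketched below. It is a hypothetical reconstruction under a stated assumption, not the published procedure: if a fraction r of each gene's reads hops and lands uniformly across all cells, then cells that do not express a signature gene should show, on average, about r times that gene's library-wide mean. The function name `estimate_hopping_rate`, its inputs, and the use of a median across signature genes are illustrative choices.

```python
import numpy as np


def estimate_hopping_rate(counts, labels, signature_genes):
    """Estimate the per-gene misassignment rate from signature genes.

    counts: array of shape (n_cells, n_genes).
    labels: length n_cells sequence of cell-type labels.
    signature_genes: dict mapping a cell type to the column indices of
        genes assumed to be expressed only in that cell type.
    Assumes hopped reads are spread uniformly over all cells, so the
    mean signal in non-expressing cells is about rate * library mean.
    """
    counts = np.asarray(counts, dtype=float)
    labels = np.asarray(labels)
    rates = []
    for cell_type, gene_indices in signature_genes.items():
        other_cells = labels != cell_type
        if not other_cells.any():
            continue
        for g in gene_indices:
            library_mean = counts[:, g].mean()
            if library_mean > 0:
                # Signal in cells that should not express gene g,
                # relative to the gene's library-wide average.
                rates.append(counts[other_cells, g].mean() / library_mean)
    if not rates:
        raise ValueError("no usable signature genes")
    return float(np.median(rates))


# Toy example with two cell types and one signature gene each.
toy_counts = np.array([
    [1000.0, 10.0],
    [1000.0, 10.0],
    [5.0, 2000.0],
    [5.0, 2000.0],
])
toy_labels = ["A", "A", "B", "B"]
toy_rate = estimate_hopping_rate(toy_counts, toy_labels, {"A": [0], "B": [1]})
```

A rate estimated this way is expressed relative to the library mean, which keeps it consistent with a correction that subtracts a fixed fraction of each gene's library-average expression.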
