Lingling Miao, Loren Collado, Savannah Barkdull, Yoshine Saito, Jay-Hyun Jo, Jungmin Han, Stefania Dell'Orso, Michael C Kelly, Heidi H Kong, Isaac Brownell
{"title":"估算和纠正单细胞 RNA-seq 数据中的索引跳转错配。","authors":"Lingling Miao, Loren Collado, Savannah Barkdull, Yoshine Saito, Jay-Hyun Jo, Jungmin Han, Stefania Dell'Orso, Michael C Kelly, Heidi H Kong, Isaac Brownell","doi":"10.1101/2024.10.21.619353","DOIUrl":null,"url":null,"abstract":"<p><strong>Background: </strong>Index hopping causes read assignment errors in data from multiplexed sequencing libraries. This issue has become more prevalent with the widespread use of high-capacity sequencers and highly multiplexed single-cell RNA sequencing (scRNA- seq).</p><p><strong>Results: </strong>We conducted deep, plate-based scRNA-seq on a mixed population of mouse skin cells. Analysis of transcriptomes from 1152 cells identified four distinct cell types. To estimate the error rate in sample assignment due to index hopping, we employed differential expression analysis to identify signature genes that were highly and specifically expressed in each cell type. We quantified the proportion of misassigned reads by examining the detection rates of signature genes in other cell types. Remarkably, regardless of gene expression levels, we estimated that 0.65% of reads per gene were assigned to incorrect cell across our data. To computationally compensate for index hopping, we developed a simple correction method wherein, for each gene, 0.65% of the library's average expression level was subtracted from the expression in each cell. This correction had notable effects on transcriptome analyses, including increased cell-cell clustering distance and alterations in intermediate state assignments of cell differentiation.</p><p><strong>Conclusions: </strong>Index hopping misassignments are measurable and can impact the experimental interpretation of sequencing results. We devised a straightforward method to estimate and correct for the index hopping rate by quantifying misassigned genes in distinct cell types within an scRNA-seq library. This approach can be applied to any barcoded, multiplexed scRNA-seq library containing cells with distinct expression profiles, allowing for correction of the expression matrix before conducting biological analysis.</p>","PeriodicalId":519960,"journal":{"name":"bioRxiv : the preprint server for biology","volume":" ","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-12-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11527012/pdf/","citationCount":"0","resultStr":"{\"title\":\"Estimating and correcting index hopping misassignments in single-cell RNA-seq data.\",\"authors\":\"Lingling Miao, Loren Collado, Savannah Barkdull, Yoshine Saito, Jay-Hyun Jo, Jungmin Han, Stefania Dell'Orso, Michael C Kelly, Heidi H Kong, Isaac Brownell\",\"doi\":\"10.1101/2024.10.21.619353\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><strong>Background: </strong>Index hopping causes read assignment errors in data from multiplexed sequencing libraries. This issue has become more prevalent with the widespread use of high-capacity sequencers and highly multiplexed single-cell RNA sequencing (scRNA- seq).</p><p><strong>Results: </strong>We conducted deep, plate-based scRNA-seq on a mixed population of mouse skin cells. Analysis of transcriptomes from 1152 cells identified four distinct cell types. To estimate the error rate in sample assignment due to index hopping, we employed differential expression analysis to identify signature genes that were highly and specifically expressed in each cell type. We quantified the proportion of misassigned reads by examining the detection rates of signature genes in other cell types. Remarkably, regardless of gene expression levels, we estimated that 0.65% of reads per gene were assigned to incorrect cell across our data. To computationally compensate for index hopping, we developed a simple correction method wherein, for each gene, 0.65% of the library's average expression level was subtracted from the expression in each cell. This correction had notable effects on transcriptome analyses, including increased cell-cell clustering distance and alterations in intermediate state assignments of cell differentiation.</p><p><strong>Conclusions: </strong>Index hopping misassignments are measurable and can impact the experimental interpretation of sequencing results. We devised a straightforward method to estimate and correct for the index hopping rate by quantifying misassigned genes in distinct cell types within an scRNA-seq library. This approach can be applied to any barcoded, multiplexed scRNA-seq library containing cells with distinct expression profiles, allowing for correction of the expression matrix before conducting biological analysis.</p>\",\"PeriodicalId\":519960,\"journal\":{\"name\":\"bioRxiv : the preprint server for biology\",\"volume\":\" \",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-12-16\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11527012/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"bioRxiv : the preprint server for biology\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1101/2024.10.21.619353\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"bioRxiv : the preprint server for biology","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1101/2024.10.21.619353","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Estimating and correcting index hopping misassignments in single-cell RNA-seq data.
Background: Index hopping causes read assignment errors in data from multiplexed sequencing libraries. This issue has become more prevalent with the widespread use of high-capacity sequencers and highly multiplexed single-cell RNA sequencing (scRNA- seq).
Results: We conducted deep, plate-based scRNA-seq on a mixed population of mouse skin cells. Analysis of transcriptomes from 1152 cells identified four distinct cell types. To estimate the error rate in sample assignment due to index hopping, we employed differential expression analysis to identify signature genes that were highly and specifically expressed in each cell type. We quantified the proportion of misassigned reads by examining the detection rates of signature genes in other cell types. Remarkably, regardless of gene expression levels, we estimated that 0.65% of reads per gene were assigned to incorrect cell across our data. To computationally compensate for index hopping, we developed a simple correction method wherein, for each gene, 0.65% of the library's average expression level was subtracted from the expression in each cell. This correction had notable effects on transcriptome analyses, including increased cell-cell clustering distance and alterations in intermediate state assignments of cell differentiation.
Conclusions: Index hopping misassignments are measurable and can impact the experimental interpretation of sequencing results. We devised a straightforward method to estimate and correct for the index hopping rate by quantifying misassigned genes in distinct cell types within an scRNA-seq library. This approach can be applied to any barcoded, multiplexed scRNA-seq library containing cells with distinct expression profiles, allowing for correction of the expression matrix before conducting biological analysis.