A Strategy to Compare Single-Cell RNA Sequencing Data Sets Provides Phenotypic Insight into Cellular Heterogeneity Underlying Biological Similarities and Differences Between Samples.
Dan C Wilkinson, Elizabeth Tallman, Mishal Ashraf, Tatiana Gelaf Romer, Jeehoon Lee, Benjamin Burnett, Pierre R Bushel
Bioinformatics and Biology Insights, volume 18, article 11779322241280866. Published 2024-10-02. DOI: https://doi.org/10.1177/11779322241280866. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11457179/pdf/
Citations: 0
Abstract
Single-cell RNA sequencing (scRNA-seq) enables an unbiased assessment of cellular phenotypes from transcriptomic data. An important question in downstream analysis is how to evaluate biological similarities and differences between samples in high-dimensional space. This becomes especially complex when the samples themselves are cellularly heterogeneous. Here, we present scCompare, a computational pipeline for comparing scRNA-seq data sets. Phenotypic identities from a known data set are transferred onto another data set by correlation-based mapping to the average transcriptomic signature of each annotated cluster of cells. Statistically derived lower cutoffs for phenotype inclusivity allow cells to remain unmapped if they are distinct from the known phenotypes, which facilitates the detection of potentially novel cell types. In a comparison using scRNA-seq data sets from human peripheral blood mononuclear cells (PBMCs), we show that scCompare outperforms single-cell variational inference (scVI), with higher precision and sensitivity for most cell types. Applied to a cardiomyocyte data set, scCompare confirmed the discovery of a distinct cluster of cells that differed between the 2 differentiation protocols. Further use of scCompare on cell atlas data sets revealed insights into the cellular heterogeneity underpinning biological diversity between samples. In addition, we used a cell atlas to better understand the effect of key parameters used in the scCompare pipeline. We envision that scCompare will be of value to the research community when comparing large scRNA-seq data sets.
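The mapping idea in the abstract (average signatures per annotated cluster, correlation-based assignment, and a lower cutoff below which a cell stays unmapped) can be illustrated with a minimal NumPy sketch. This is not the scCompare implementation or its API: the function names, the use of Pearson correlation, and the cutoff rule (median minus `n_mads` median absolute deviations of each phenotype's own reference correlations) are assumptions made here for illustration, and the data are synthetic.

```python
import numpy as np

def cluster_signatures(X, labels):
    """Average expression profile (signature) for each annotated phenotype."""
    names = sorted(set(labels))
    lab = np.asarray(labels)
    sigs = np.vstack([X[lab == n].mean(axis=0) for n in names])
    return names, sigs

def correlate(X, sigs):
    """Pearson correlation of every cell (row of X) with every signature."""
    Xc = X - X.mean(axis=1, keepdims=True)
    Sc = sigs - sigs.mean(axis=1, keepdims=True)
    Xn = Xc / np.linalg.norm(Xc, axis=1, keepdims=True)
    Sn = Sc / np.linalg.norm(Sc, axis=1, keepdims=True)
    return Xn @ Sn.T

def lower_cutoffs(X_ref, labels, names, sigs, n_mads=5.0):
    """Assumed cutoff rule: median - n_mads * MAD of each phenotype's own
    reference cells' correlation with that phenotype's signature."""
    lab = np.asarray(labels)
    corr = correlate(X_ref, sigs)
    cutoffs = []
    for j, name in enumerate(names):
        own = corr[lab == name, j]
        med = np.median(own)
        mad = np.median(np.abs(own - med))
        cutoffs.append(med - n_mads * mad)
    return np.array(cutoffs)

def map_cells(X_query, names, sigs, cutoffs, unmapped="unmapped"):
    """Assign each query cell to its best-correlated phenotype, or leave it
    unmapped when that correlation falls below the phenotype's cutoff."""
    corr = correlate(X_query, sigs)
    best = corr.argmax(axis=1)
    best_corr = corr[np.arange(len(best)), best]
    return [names[j] if c >= cutoffs[j] else unmapped
            for j, c in zip(best, best_corr)]

# Synthetic demo: 50 genes, two known phenotypes plus one unseen profile.
rng = np.random.default_rng(0)
profile_a = np.ones(50); profile_a[:20] = 10.0    # genes 0-19 high
profile_b = np.ones(50); profile_b[20:40] = 10.0  # genes 20-39 high
novel = np.ones(50); novel[40:] = 10.0            # genes 40-49 high

X_ref = np.vstack([profile_a + rng.normal(0, 0.5, (30, 50)),
                   profile_b + rng.normal(0, 0.5, (30, 50))])
ref_labels = ["A"] * 30 + ["B"] * 30

names, sigs = cluster_signatures(X_ref, ref_labels)
cutoffs = lower_cutoffs(X_ref, ref_labels, names, sigs)
result = map_cells(np.vstack([profile_a, profile_b, novel]),
                   names, sigs, cutoffs)
print(result)
```

In this toy setup the two query cells that match a known profile map to "A" and "B", while the third, whose marker genes overlap neither reference signature, correlates negatively with both and is left unmapped. The `n_mads` value stands in for the kind of key parameter whose effect the abstract says was explored with a cell atlas; the specific default here is arbitrary.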
About the journal:
Bioinformatics and Biology Insights is an open access, peer-reviewed journal that considers articles on bioinformatics methods and their applications, which must pertain to biological insights. All papers should be readily accessible to biologists and thereby help bridge the gap between theory and application.