{"title":"SCMcluster: a high-precision cell clustering algorithm integrating marker gene set with single-cell RNA sequencing data.","authors":"Hao Wu, Haoru Zhou, Bing Zhou, Meili Wang","doi":"10.1093/bfgp/elad004","DOIUrl":null,"url":null,"abstract":"<p><p>Single-cell clustering is the most significant part of single-cell RNA sequencing (scRNA-seq) data analysis. One main issue facing the scRNA-seq data is noise and sparsity, which poses a great challenge for the advance of high-precision clustering algorithms. This study adopts cellular markers to identify differences between cells, which contributes to feature extraction of single cells. In this work, we propose a high-precision single-cell clustering algorithm-SCMcluster (single-cell cluster using marker genes). This algorithm integrates two cell marker databases(CellMarker database and PanglaoDB database) with scRNA-seq data for feature extraction and constructs an ensemble clustering model based on the consensus matrix. We test the efficiency of this algorithm and compare it with other eight popular clustering algorithms on two scRNA-seq datasets derived from human and mouse tissues, respectively. The experimental results show that SCMcluster outperforms the existing methods in both feature extraction and clustering performance. The source code of SCMcluster is available for free at https://github.com/HaoWuLab-Bioinformatics/SCMcluster.</p>","PeriodicalId":2,"journal":{"name":"ACS Applied Bio Materials","volume":null,"pages":null},"PeriodicalIF":4.6000,"publicationDate":"2023-07-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACS Applied Bio Materials","FirstCategoryId":"99","ListUrlMain":"https://doi.org/10.1093/bfgp/elad004","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"MATERIALS SCIENCE, BIOMATERIALS","Score":null,"Total":0}
引用次数: 0
Abstract
Single-cell clustering is the most significant part of single-cell RNA sequencing (scRNA-seq) data analysis. One main issue facing the scRNA-seq data is noise and sparsity, which poses a great challenge for the advance of high-precision clustering algorithms. This study adopts cellular markers to identify differences between cells, which contributes to feature extraction of single cells. In this work, we propose a high-precision single-cell clustering algorithm-SCMcluster (single-cell cluster using marker genes). This algorithm integrates two cell marker databases(CellMarker database and PanglaoDB database) with scRNA-seq data for feature extraction and constructs an ensemble clustering model based on the consensus matrix. We test the efficiency of this algorithm and compare it with other eight popular clustering algorithms on two scRNA-seq datasets derived from human and mouse tissues, respectively. The experimental results show that SCMcluster outperforms the existing methods in both feature extraction and clustering performance. The source code of SCMcluster is available for free at https://github.com/HaoWuLab-Bioinformatics/SCMcluster.