scSDSC: Self-supervised Deep Subspace Clustering for scRNA-seq Data
Jian-ping Zhao, Bo Yang, Hai-yun Wang, Chunhan Zheng
Current Bioinformatics, published 2023-08-16
DOI: 10.2174/1574893618666230816090443 (https://doi.org/10.2174/1574893618666230816090443)
Abstract
Single-cell RNA sequencing (scRNA-seq) data reveal heterogeneity between cells, which makes it possible to identify cell types and discover rare ones. Clustering is commonly used to identify cell types, but the high noise and high dimensionality of scRNA-seq data degrade clustering performance and affect downstream analysis. Deep learning is widely used in this field and performs well at feature learning.
Most deep learning models consider only the relationships between genes and ignore the relationships between cells. We use both the relationships between cells and the relationships between genes to construct a clustering model.
We propose scSDSC, a deep subspace clustering architecture that considers the relationships between genes and the relationships between cells at the same time. As in deep subspace clustering (DSC), we add a fully connected layer after the embedding layer to obtain the self-expression matrix. We also add a fully connected SoftMax layer that generates pseudo-labels, and the information carried by the pseudo-labels is used for model training. Finally, the affinity matrix is obtained and passed to spectral clustering.
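The step from a self-expression matrix to an affinity matrix and then to spectral clustering can be illustrated with a small NumPy sketch. It is not the authors' implementation: the trained fully connected self-expression layer is replaced here by a closed-form ridge solution, the autoencoder and the SoftMax pseudo-label branch are left out, and every function name, the regularisation weight `lam`, and the symmetrisation rule `(|C| + |C|^T) / 2` are assumptions chosen for illustration.

```python
import numpy as np


def self_expression(Z, lam=0.1):
    """Closed-form ridge self-expression (assumed stand-in for a trained layer).

    Solves argmin_C ||Z - C Z||_F^2 + lam * ||C||_F^2, with one row of Z per cell.
    """
    G = Z @ Z.T                                   # (n, n) Gram matrix between cells
    return np.linalg.solve(G + lam * np.eye(len(Z)), G)


def affinity_from_coefficients(C):
    """Symmetric, non-negative affinity matrix with a zero diagonal."""
    A = 0.5 * (np.abs(C) + np.abs(C).T)
    np.fill_diagonal(A, 0.0)
    return A


def spectral_embedding(A, k):
    """Row-normalised eigenvectors for the k smallest eigenvalues of the
    symmetric normalised graph Laplacian."""
    d = A.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    L = np.eye(len(A)) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)                   # eigenvalues come back in ascending order
    U = vecs[:, :k]
    return U / np.maximum(np.linalg.norm(U, axis=1, keepdims=True), 1e-12)


def kmeans(X, k, n_iter=50):
    """Plain Lloyd's k-means with deterministic farthest-point initialisation."""
    centers = [X[0]]
    for _ in range(1, k):
        dist = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        sq_dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = np.argmin(sq_dist, axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels


def subspace_cluster(Z, k, lam=0.1):
    """Self-expression -> affinity -> spectral embedding -> k-means labels."""
    C = self_expression(Z, lam)
    A = affinity_from_coefficients(C)
    return kmeans(spectral_embedding(A, k), k)


if __name__ == "__main__":
    # Toy embeddings (hypothetical data): two groups of cells lying in
    # two different 2-D subspaces of a 6-D embedding space.
    rng = np.random.default_rng(0)
    Z = np.zeros((40, 6))
    Z[:20, 0:2] = rng.normal(size=(20, 2))
    Z[20:, 2:4] = rng.normal(size=(20, 2))
    print(subspace_cluster(Z, k=2))
```

In this sketch `Z` plays the role of the cell embeddings produced by the embedding layer. Because each cell is reconstructed from other cells, large entries of `C` tend to link cells from the same subspace, which is why the symmetrised coefficient matrix can serve as a graph for spectral clustering.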
Experimental results on eight real datasets show that scSDSC outperforms existing methods in downstream analysis.
Our method improves clustering accuracy and benefits downstream analysis.
About the journal:
Current Bioinformatics aims to publish all the latest and outstanding developments in bioinformatics. Each issue contains a series of timely, in-depth/mini-reviews, research papers and guest edited thematic issues written by leaders in the field, covering a wide range of the integration of biology with computer and information science.
The journal focuses on advances in computational molecular/structural biology, encompassing areas such as computing in biomedicine and genomics, computational proteomics and systems biology, and metabolic pathway engineering. Developments in these fields have direct implications for key issues related to health care, medicine, genetic disorders, development of agricultural products, renewable energy, environmental protection, etc.