Yue Yu, Wei Zhang, Xiaoying Zheng, Juan Shen, Yuanyuan Li
Interdisciplinary Sciences: Computational Life Sciences (IF 3.9; JCR Q1, Mathematical & Computational Biology). Published 2025-09-02. DOI: 10.1007/s12539-025-00762-y (https://doi.org/10.1007/s12539-025-00762-y)
Clustering Single-Cell RNA-Seq Data with Low-Rank Matrix Factorization and Local Graph Regularization.
Single-cell RNA sequencing (scRNA-seq) offers significant opportunities to reveal cellular heterogeneity and diversity. Accurate cell type identification is critical for downstream analyses and for understanding the mechanisms of heterogeneity. However, challenges arise from the high dimensionality, sparsity, and noise of scRNA-seq data. While various low-rank representation (LRR)-based clustering methods have been developed, many existing approaches may inaccurately capture relationships between cells or conflate true patterns with noise. To address these limitations, we introduce a novel clustering algorithm that integrates low-rank matrix decomposition with local graph regularization (LRMGC). This approach applies a tri-decomposition strategy to the representation matrix to derive an aligned core matrix, and characterizes the "distance" between cells in a lower-dimensional space through a local manifold regularization term. Rather than relying on the nuclear norm of the representation matrix, the Schatten p-norm is applied to the core matrix to learn a similarity matrix that is robust to noise and outliers, while preserving the underlying subspace structure of the high-dimensional, noisy data for accurate and robust clustering. Additionally, the final similarity matrix is obtained by applying an angular alignment strategy to the learned similarity matrix. Comprehensive experiments and comparisons with advanced methods on scRNA-seq datasets demonstrate LRMGC's superior performance and reliability in uncovering cell type composition. Furthermore, a variety of downstream analyses, such as marker gene identification, functional enrichment analysis, rare cell recognition, and cell-cell communication, also demonstrate the effectiveness of LRMGC.
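As a rough illustration of two building blocks named in the abstract, the following NumPy sketch computes a Schatten p-norm (which reduces to the nuclear norm at p = 1) and a graph-Laplacian regularization term of the form tr(Z L Zᵀ). It is not the authors' implementation: the function names, the binary k-nearest-neighbor graph, the unnormalized Laplacian, and the default values of `p` and `k` are all assumptions chosen for illustration, and the tri-decomposition and angular alignment steps are not shown.

```python
import numpy as np


def schatten_p_norm(M, p=0.5):
    """Schatten p-(quasi-)norm: (sum_i sigma_i^p)^(1/p).

    p = 1 gives the nuclear norm, p = 2 the Frobenius norm.
    The default p = 0.5 is an illustrative assumption.
    """
    s = np.linalg.svd(M, compute_uv=False)  # singular values only
    return float(np.sum(s ** p) ** (1.0 / p))


def knn_laplacian(X, k=5):
    """Unnormalized Laplacian L = D - W of a symmetrized binary k-NN graph.

    X has shape (cells, features). The graph construction is an assumption;
    the paper may weight or normalize edges differently.
    """
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)  # squared distances
    np.fill_diagonal(d2, np.inf)                      # exclude self-neighbors
    idx = np.argsort(d2, axis=1)[:, :k]               # k nearest per cell
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    W = np.maximum(W, W.T)                            # symmetrize
    return np.diag(W.sum(axis=1)) - W


def graph_regularizer(Z, L):
    """tr(Z L Z^T) = 0.5 * sum_ij W_ij * ||z_i - z_j||^2 over columns z_i of Z."""
    return float(np.trace(Z @ L @ Z.T))


# Hypothetical usage on random data: 30 "cells", 10 "genes", 4 latent dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 10))
Z = rng.normal(size=(4, 30))
L = knn_laplacian(X, k=5)
penalty = schatten_p_norm(Z, p=0.5) + graph_regularizer(Z, L)
```

The regularizer is small when cells that are neighbors in the input space also have nearby low-dimensional representations, which is the usual motivation for a local manifold term of this kind.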
About the journal:
Interdisciplinary Sciences: Computational Life Sciences aims to cover the most recent and outstanding developments in interdisciplinary areas of science, with a particular focus on computational life sciences, an area that is developing rapidly at the forefront of scientific research and technology.
The journal publishes original papers of significant general interest covering recent research and developments. Articles are published rapidly by taking full advantage of internet technology for online submission and peer review of manuscripts, and then by publishing Online First™ through SpringerLink even before the issue is built or sent to the printer.
The editorial board consists of many leading scientists of international reputation, including Luc Montagnier (UNESCO, France), Dennis Salahub (University of Calgary, Canada), and Weitao Yang (Duke University, USA). Prof. Dongqing Wei of Shanghai Jiao Tong University is the editor-in-chief; he has made important contributions to bioinformatics and computational physics and is best known for his ground-breaking work on the theory of ferroelectric liquids. With the help of a team of associate editors and the editorial board, the journal aims to build a sound international reputation.