scEVE: a single-cell RNA-seq ensemble clustering algorithm capitalizing on the differences of predictions between multiple clustering methods.

Impact factor: 2.8 · JCR Q1 (Genetics & Heredity)

NAR Genomics and Bioinformatics · Publication date: 2025-06-09 · eCollection date: 2025-06-01 · DOI: 10.1093/nargab/lqaf073

Yanis Asloudj, Fleur Mougin, Patricia Thébault

Volume 7, issue 2, article lqaf073. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12147100/pdf/


Abstract

Single-cell RNA sequencing measures individual cell transcriptomes in a sample. In the past decade, this technology has motivated the development of hundreds of clustering methods. These methods attempt to group cells into populations by leveraging the similarity of their transcriptomes. Because each method relies on specific hypotheses, their predictions can vary drastically. To address this issue, ensemble algorithms detect cell populations by integrating multiple clustering methods and minimizing the differences between their predictions. While this approach is sensible, it has yet to address two conceptual challenges in single-cell data science: ensemble algorithms do not yet generate clustering results with uncertainty values, nor results at multiple resolutions. In this work, we present an original approach to ensemble clustering that addresses these challenges by describing the differences between clustering results, rather than minimizing them. We present the scEVE algorithm and evaluate it on 15 experimental datasets and up to 1200 synthetic datasets. Our results reveal that scEVE outperforms the state of the art and addresses both conceptual challenges. We also highlight how downstream biological analyses will benefit from addressing these challenges. We expect that this work will provide an alternative direction for developing single-cell ensemble clustering algorithms.
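To make the idea of "describing the differences between clustering results" more concrete, the sketch below shows one generic way to quantify disagreement between several clusterings of the same cells. It is not the scEVE procedure, which the abstract does not specify; the function names, the toy label vectors, and the per-cell uncertainty score (the share of a cell's pairings on which the methods are not unanimous) are all assumptions made for illustration.

```python
from itertools import combinations


def coclustering_counts(predictions):
    """Count, for every pair of cells, how many methods co-cluster them.

    `predictions` is a list of label sequences, one per clustering method,
    all covering the same cells in the same order. Label values only need
    to be comparable within a single method.
    """
    n_cells = len(predictions[0])
    if any(len(labels) != n_cells for labels in predictions):
        raise ValueError("every method must label the same number of cells")
    counts = {}
    for i, j in combinations(range(n_cells), 2):
        counts[(i, j)] = sum(1 for labels in predictions if labels[i] == labels[j])
    return counts


def cell_uncertainty(predictions):
    """Hypothetical per-cell score: share of contested pairings.

    A pairing is contested when some, but not all, methods put the two
    cells in the same cluster. 0.0 means the methods fully agree about
    the cell; 1.0 means every pairing of the cell is contested.
    """
    n_cells = len(predictions[0])
    n_methods = len(predictions)
    contested = [0] * n_cells
    for (i, j), together in coclustering_counts(predictions).items():
        if 0 < together < n_methods:
            contested[i] += 1
            contested[j] += 1
    return [c / (n_cells - 1) for c in contested]


# Toy example: three hypothetical methods labelling five cells.
method_a = [0, 0, 0, 1, 1]
method_b = [0, 0, 1, 1, 1]
method_c = [5, 5, 5, 7, 7]  # different label names, same partition as method_a
predictions = [method_a, method_b, method_c]

counts = coclustering_counts(predictions)
uncertainty = cell_uncertainty(predictions)
```

In this toy input, the methods disagree only about the third cell, so it receives the highest score, while the other cells are contested only in their pairing with it. A score of this kind attaches an uncertainty value to each cell instead of forcing a single consensus label, which is the general direction the abstract describes.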



