Between Cluster Analysis: Supervised Dimensionality Reduction for Trajectory Inference
Alexander Strzalkowski, Ron Zeira, Benjamin J Raphael
Bioinformatics (Oxford, England), published 2025-06-05
DOI: https://doi.org/10.1093/bioinformatics/btaf306
Abstract
Motivation: Single-cell RNA sequencing (scRNA-seq) measures the transcriptional state of individual cells, enabling more precise characterization of cell types, cell states, and developmental trajectories. Because scRNA-seq data are high-dimensional, a standard first step in scRNA-seq analysis is dimensionality reduction. Principal component analysis (PCA) and many other commonly used dimensionality reduction techniques are unsupervised, meaning that they do not incorporate any prior knowledge of the data being analyzed. In contrast, nearly all trajectory inference methods are supervised, relying on information such as a clustering of cells into cell types/states.
Results: We introduce Between Cluster Analysis (BCA), a supervised linear dimensionality reduction technique that uses cluster labels of cells as prior information and computes an embedding that maximizes the between-cluster variance. We show on both simulated and real data that BCA improves trajectory inference compared to other dimensionality reduction methods, including Linear Discriminant Analysis (LDA), another supervised linear dimensionality reduction method. Additionally, we observe that many commonly used metrics for trajectory inference assess only the ordering of cell types and not the identification or ordering of intermediate cell states. We propose an alternative measure that evaluates how well trajectory inference methods preserve intermediate cells, especially when the ordering of these intermediate cells is unknown.
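To make the idea of "an embedding that maximizes the between-cluster variance" concrete, here is a minimal sketch. It assumes the classical between-group formulation: project onto the leading eigenvectors of the size-weighted scatter matrix of cluster means. The function name `between_cluster_embedding`, the weighting, and the toy data are illustrative assumptions, not the authors' implementation; the released code at the repository listed under Availability may differ in normalization and preprocessing.

```python
import numpy as np

def between_cluster_embedding(X, labels, n_components=2):
    """Hypothetical helper: project cells onto directions of maximal
    between-cluster variance (assumed formulation, see lead-in)."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    Xc = X - X.mean(axis=0)  # center on the overall mean
    clusters = np.unique(labels)
    # Row k is sqrt(n_k) * (mean of cluster k - overall mean),
    # so M.T @ M equals the between-cluster scatter matrix.
    M = np.stack([
        np.sqrt((labels == k).sum()) * Xc[labels == k].mean(axis=0)
        for k in clusters
    ])
    # Right singular vectors of M are eigenvectors of M.T @ M,
    # ordered by decreasing between-cluster variance.
    _, _, Vt = np.linalg.svd(M, full_matrices=False)
    W = Vt[:n_components].T  # (n_features, n_components) loadings
    return Xc @ W, W

# Toy data (assumed): three clusters separated along feature 0,
# with large within-cluster noise along feature 1.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1, 2], 200)
X = rng.normal(scale=[0.5, 5.0, 0.5, 0.5], size=(600, 4))
X[:, 0] += 3.0 * labels

Z, W = between_cluster_embedding(X, labels, n_components=2)
print("first loading vector:", np.round(W[:, 0], 3))
```

In this toy setting an unsupervised method such as PCA would be drawn toward the noisy feature 1, whereas the sketch above uses the labels to favor feature 0, the direction that separates the clusters. With K clusters, at most K - 1 directions carry between-cluster variance, so `n_components` is only meaningful up to that number.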
Availability: Code is available at https://github.com/raphael-group/BCA.
Supplementary information: Supplementary data are available at Bioinformatics online.