Recurrent Composite Markers of Cell Types and States
Xubin Li, Justin Nguyen, Anil Korkut
bioRxiv : the preprint server for biology. Publication date: 2025-04-02.
DOI: 10.1101/2023.07.17.549344 (https://doi.org/10.1101/2023.07.17.549344)
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10370072/pdf/
Citations: 0
Abstract
Biological function is mediated by the hierarchical organization of cell types and states within tissue ecosystems. Identifying interpretable composite marker sets that both define and distinguish hierarchical cell identities is essential for decoding biological complexity, yet remains a major challenge. Here, we present RECOMBINE, an algorithm that identifies recurrent composite marker sets to define hierarchical cell identities. Validation using both simulated and biological datasets demonstrates that RECOMBINE achieves higher accuracy in identifying discriminative markers compared to existing approaches, including differential gene expression analysis. When applied to single-cell data and validated with spatial transcriptomics data from the mouse visual cortex, RECOMBINE identified key cell type markers and generated a robust gene panel for targeted spatial profiling. It also uncovered markers of CD8+ T cell states, including GZMK+HAVCR2- effector memory cells associated with anti-PD-1 therapy response, and revealed a rare intestinal subpopulation with composite markers in mice. Finally, using data from the Tabula Sapiens project, RECOMBINE identified composite marker sets across a broad range of human tissues. Together, these results highlight RECOMBINE as a robust, data-driven framework for optimized marker selection, enabling the discovery and validation of hierarchical cell identities across diverse tissue contexts.