MarkerMap:用于单细胞研究的非线性标记选择。

IF 3.5 2区 生物学 Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY
Wilson Gregory, Nabeel Sarwar, George Kevrekidis, Soledad Villar, Bianca Dumitrascu
{"title":"MarkerMap:用于单细胞研究的非线性标记选择。","authors":"Wilson Gregory, Nabeel Sarwar, George Kevrekidis, Soledad Villar, Bianca Dumitrascu","doi":"10.1038/s41540-024-00339-3","DOIUrl":null,"url":null,"abstract":"<p><p>Single-cell RNA-seq data allow the quantification of cell type differences across a growing set of biological contexts. However, pinpointing a small subset of genomic features explaining this variability can be ill-defined and computationally intractable. Here we introduce MarkerMap, a generative model for selecting minimal gene sets which are maximally informative of cell type origin and enable whole transcriptome reconstruction. MarkerMap provides a scalable framework for both supervised marker selection, aimed at identifying specific cell type populations, and unsupervised marker selection, aimed at gene expression imputation and reconstruction. We benchmark MarkerMap's competitive performance against previously published approaches on real single cell gene expression data sets. MarkerMap is available as a pip installable package, as a community resource aimed at developing explainable machine learning techniques for enhancing interpretability in single-cell studies.</p>","PeriodicalId":19345,"journal":{"name":"NPJ Systems Biology and Applications","volume":null,"pages":null},"PeriodicalIF":3.5000,"publicationDate":"2024-02-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10864304/pdf/","citationCount":"0","resultStr":"{\"title\":\"MarkerMap: nonlinear marker selection for single-cell studies.\",\"authors\":\"Wilson Gregory, Nabeel Sarwar, George Kevrekidis, Soledad Villar, Bianca Dumitrascu\",\"doi\":\"10.1038/s41540-024-00339-3\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>Single-cell RNA-seq data allow the quantification of cell type differences across a growing set of biological contexts. However, pinpointing a small subset of genomic features explaining this variability can be ill-defined and computationally intractable. Here we introduce MarkerMap, a generative model for selecting minimal gene sets which are maximally informative of cell type origin and enable whole transcriptome reconstruction. MarkerMap provides a scalable framework for both supervised marker selection, aimed at identifying specific cell type populations, and unsupervised marker selection, aimed at gene expression imputation and reconstruction. We benchmark MarkerMap's competitive performance against previously published approaches on real single cell gene expression data sets. MarkerMap is available as a pip installable package, as a community resource aimed at developing explainable machine learning techniques for enhancing interpretability in single-cell studies.</p>\",\"PeriodicalId\":19345,\"journal\":{\"name\":\"NPJ Systems Biology and Applications\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":3.5000,\"publicationDate\":\"2024-02-14\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10864304/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"NPJ Systems Biology and Applications\",\"FirstCategoryId\":\"99\",\"ListUrlMain\":\"https://doi.org/10.1038/s41540-024-00339-3\",\"RegionNum\":2,\"RegionCategory\":\"生物学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"MATHEMATICAL & COMPUTATIONAL BIOLOGY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"NPJ Systems Biology and Applications","FirstCategoryId":"99","ListUrlMain":"https://doi.org/10.1038/s41540-024-00339-3","RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"MATHEMATICAL & COMPUTATIONAL BIOLOGY","Score":null,"Total":0}
引用次数: 0

摘要

单细胞 RNA 序列数据可以量化越来越多生物环境中的细胞类型差异。然而,精确定位一小部分基因组特征来解释这种变异性可能是不明确的,而且在计算上也很难实现。在这里,我们介绍一种生成模型 MarkerMap,用于选择最小的基因集,这些基因集能最大程度地反映细胞类型的起源,并能重建整个转录组。MarkerMap 提供了一个可扩展的框架,既可用于有监督的标记选择(旨在识别特定的细胞类型群),也可用于无监督的标记选择(旨在基因表达归因和重建)。我们在真实的单细胞基因表达数据集上对 MarkerMap 的性能与以前发表的方法进行了比较。MarkerMap 可作为 pip 安装包提供,是一种社区资源,旨在开发可解释的机器学习技术,以提高单细胞研究的可解释性。
本文章由计算机程序翻译,如有差异,请以英文原文为准。

MarkerMap: nonlinear marker selection for single-cell studies.

MarkerMap: nonlinear marker selection for single-cell studies.

Single-cell RNA-seq data allow the quantification of cell type differences across a growing set of biological contexts. However, pinpointing a small subset of genomic features explaining this variability can be ill-defined and computationally intractable. Here we introduce MarkerMap, a generative model for selecting minimal gene sets which are maximally informative of cell type origin and enable whole transcriptome reconstruction. MarkerMap provides a scalable framework for both supervised marker selection, aimed at identifying specific cell type populations, and unsupervised marker selection, aimed at gene expression imputation and reconstruction. We benchmark MarkerMap's competitive performance against previously published approaches on real single cell gene expression data sets. MarkerMap is available as a pip installable package, as a community resource aimed at developing explainable machine learning techniques for enhancing interpretability in single-cell studies.

求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
NPJ Systems Biology and Applications
NPJ Systems Biology and Applications Mathematics-Applied Mathematics
CiteScore
5.80
自引率
0.00%
发文量
46
审稿时长
8 weeks
期刊介绍: npj Systems Biology and Applications is an online Open Access journal dedicated to publishing the premier research that takes a systems-oriented approach. The journal aims to provide a forum for the presentation of articles that help define this nascent field, as well as those that apply the advances to wider fields. We encourage studies that integrate, or aid the integration of, data, analyses and insight from molecules to organisms and broader systems. Important areas of interest include not only fundamental biological systems and drug discovery, but also applications to health, medical practice and implementation, big data, biotechnology, food science, human behaviour, broader biological systems and industrial applications of systems biology. We encourage all approaches, including network biology, application of control theory to biological systems, computational modelling and analysis, comprehensive and/or high-content measurements, theoretical, analytical and computational studies of system-level properties of biological systems and computational/software/data platforms enabling such studies.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信