Nonparametric Bayesian multiarmed bandits for single-cell experiment design

F. Camerlenghi, Bianca Dumitrascu, F. Ferrari, B. Engelhardt, S. Favaro
Journal: arXiv: Applications. Publication date: 2019-10-11. DOI: 10.1214/20-aoas1370 (https://doi.org/10.1214/20-aoas1370)
Citation count: 7

Abstract

The problem of maximizing cell type discovery under budget constraints is a fundamental challenge in the collection and analysis of single-cell RNA-sequencing (scRNA-seq) data. In this paper, we introduce a simple, computationally efficient, and scalable Bayesian nonparametric sequential approach to optimize the budget allocation when designing a large-scale collection of scRNA-seq data for the purpose of, but not limited to, creating cell atlases. Our approach relies on (i) a hierarchical Pitman-Yor prior that recapitulates biological assumptions regarding cellular differentiation, and (ii) a Thompson sampling multi-armed bandit strategy that balances exploitation and exploration to prioritize experiments across a sequence of trials. Posterior inference is performed using a sequential Monte Carlo approach, which allows us to fully exploit the sequential nature of our species sampling problem. We empirically show that our approach outperforms state-of-the-art methods and achieves near-oracle performance on simulated and real data alike. HPY-TS code is available at this https URL.
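
To make the bandit idea in the abstract concrete, here is a minimal Python sketch of Thompson sampling for type discovery. It is not the HPY-TS method: as a simplifying assumption it gives each arm (experiment) its own independent Pitman-Yor process with fixed discount sigma and strength theta, so no information is shared across arms through a hierarchy and no sequential Monte Carlo is used. It relies on the standard Pitman-Yor fact that, after n draws showing k distinct types, the posterior mass on unseen types is Beta(theta + sigma*k, n - sigma*k). The function names and the `arms` callables are hypothetical, introduced only for illustration.

```python
import random


def sample_unseen_mass(n, k, sigma, theta, rng):
    """Draw the posterior mass on not-yet-seen types for one arm.

    Assumes an independent Pitman-Yor prior per arm (not the paper's
    hierarchical prior), with 0 <= sigma < 1 and theta > 0.
    """
    if n == 0:
        return 1.0  # nothing observed yet, so any cell would be a new type
    return rng.betavariate(theta + sigma * k, n - sigma * k)


def choose_arm(counts, sigma, theta, rng):
    """Thompson step: sample a discovery probability per arm, play the largest."""
    draws = [
        sample_unseen_mass(sum(c.values()), len(c), sigma, theta, rng)
        for c in counts
    ]
    return max(range(len(counts)), key=draws.__getitem__)


def run(arms, budget, sigma=0.5, theta=1.0, seed=0):
    """Spend `budget` cells across `arms`.

    `arms` is a list of zero-argument callables (hypothetical), each
    returning the type label of one newly sequenced cell.
    """
    rng = random.Random(seed)
    counts = [dict() for _ in arms]  # per-arm type -> count
    discovered = set()               # types seen in any arm
    for _ in range(budget):
        j = choose_arm(counts, sigma, theta, rng)
        label = arms[j]()
        counts[j][label] = counts[j].get(label, 0) + 1
        discovered.add(label)
    return discovered, counts
```

Because each arm is modelled separately in this sketch, "new" means new within that arm; rewarding only types that are new across all arms is what motivates the hierarchical prior described in the abstract.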