A semi-supervised Bayesian approach for marker gene trajectory inference from single-cell RNA-seq data.

IF 5.4
Junchao Wang, Ling Sun, Nana Wei, Yisheng Huang, Naiqian Zhang
Journal: Bioinformatics (Oxford, England)
Publication date: 2025-09-01
Publication type: Journal Article
DOI: 10.1093/bioinformatics/btaf454 (https://doi.org/10.1093/bioinformatics/btaf454)
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12410927/pdf/
Citations: 0

Abstract

Motivation: Trajectory inference methods are essential for extracting temporal ordering from static single-cell transcriptomic profiles, thus facilitating the accurate delineation of cellular developmental hierarchies and cell-fate transitions. However, numerous existing methods treat trajectory inference as an unsupervised learning task, rendering them susceptible to technical noise and data sparsity, which often lead to unstable reconstructions and ambiguous lineage assignments.

Results: Here, we introduce BayesTraj, a semi-supervised Bayesian framework that incorporates prior knowledge of lineage topology and marker-gene expression to robustly reconstruct differentiation trajectories from scRNA-seq data. BayesTraj models cellular differentiation as a probabilistic mixture of latent lineages and captures marker-gene dynamics through parametric functions. Posterior inference is conducted using Hamiltonian Monte Carlo (HMC), yielding estimates of pseudotime, lineage proportions, and gene activation parameters. Evaluations on both simulated and real datasets with diverse branching structures demonstrate that BayesTraj consistently outperforms state-of-the-art methods in pseudotime inference. In addition, it provides per-cell branch-assignment probabilities, enabling the quantification of differentiation potential using Shannon entropy and the detection of lineage-specific gene expression via Bayesian model comparison.
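
As an illustration of the entropy-based measure of differentiation potential mentioned above, the following minimal Python sketch computes the Shannon entropy of one cell's branch-assignment probabilities. It is not BayesTraj's implementation (the package itself is written in R); the function name `branch_entropy`, the renormalization step, and the choice of base-2 logarithms are assumptions made here for illustration, since the abstract does not specify them.

```python
import math


def branch_entropy(probs):
    """Shannon entropy (in bits) of one cell's branch-assignment probabilities.

    Hypothetical helper, not part of BayesTraj. Base-2 logarithms and the
    renormalization of the input are illustrative assumptions.
    """
    if any(p < 0 for p in probs):
        raise ValueError("probabilities must be non-negative")
    total = sum(probs)
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    # Renormalize so the values sum to 1 (guards against rounding drift).
    normalized = [p / total for p in probs]
    # Zero-probability branches contribute nothing to the entropy.
    return -sum(p * math.log2(p) for p in normalized if p > 0)


# A cell split evenly between two branches is maximally uncertain (1 bit),
# while a cell assigned to a single branch with certainty has zero entropy.
uncommitted = branch_entropy([0.5, 0.5])
committed = branch_entropy([1.0, 0.0])
```

Under this reading, higher entropy indicates a cell whose lineage is still undetermined, and entropy near zero indicates a cell that has committed to one branch.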

Availability and implementation: BayesTraj is written in R, is available at https://github.com/SDU-W-Zhanglab/BayesTraj, and has been archived on Zenodo (DOI: 10.5281/zenodo.16758038).

Supplementary information: Supplementary data are available at Bioinformatics online.