The Bayesian Phylogenetic Bootstrap, Application to Short Trees and Branches.

IF 11 · Zone 1 (Biology) · JCR Q1 (Biochemistry & Molecular Biology)
Frédéric Lemoine, Olivier Gascuel
Molecular Biology and Evolution · Published 2024-11-08 · DOI: 10.1093/molbev/msae238 (https://doi.org/10.1093/molbev/msae238)
Citations: 0

Abstract

Felsenstein's bootstrap is the most commonly used method to measure branch support in phylogenetics. Current sequencing technologies can result in massive sampling of taxa (e.g., SARS-CoV-2). In this case, the sequences are very similar, the trees are short, and the branches correspond to a small number of mutations (possibly 0). Nevertheless, these trees contain a strong signal, with unresolved parts but a low rate of false branches. With such data, Felsenstein's bootstrap is not satisfactory. Due to the frequentist nature of bootstrap sampling, the expected support of a branch corresponding to a single mutation is ∼63%, even though it is highly likely to be correct. Here we propose a Bayesian version of the phylogenetic bootstrap in which sites are assigned uninformative prior probabilities. The branch support can then be interpreted as a posterior probability. We do not view the alignment as a small subsample of a large sample of sites, but rather as containing all available information (e.g., as with complete viral genomes, which are becoming routine). We give formulas for expected supports under the assumption of perfect phylogeny in both the frequentist and Bayesian frameworks; in the Bayesian framework, a branch corresponding to a single mutation now has an expected support of ∼90%. Simulations show that these theoretical results are robust to realistic data. Analyses of low-homoplasy viral and non-viral datasets show that Bayesian bootstrap support is easier to interpret, with high supports for branches very likely to be correct. As homoplasy increases, the two supports become closer and strongly correlated.
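
The ∼63% figure in the abstract can be checked with a short calculation. Under perfect phylogeny, a branch carried by a single mutation survives a bootstrap replicate only if its site is drawn at least once, which happens with probability 1 − (1 − 1/n)^n ≈ 1 − 1/e for an alignment of n sites. The Python sketch below computes and simulates that value. Its Bayesian half is only an illustration under assumptions that are not taken from the abstract: site weights are drawn from a flat Dirichlet distribution (Rubin's Bayesian bootstrap), and a branch is counted as supported when its site's weight, rescaled by n, exceeds a hypothetical threshold. The function names, the alignment length, and the threshold value 0.1 are all illustrative choices, and the authors' actual formula may differ.

```python
import math
import random


def frequentist_support(n_sites: int) -> float:
    """Probability that one given site is drawn at least once when
    n_sites columns are resampled with replacement."""
    return 1.0 - (1.0 - 1.0 / n_sites) ** n_sites


def simulate_frequentist(n_sites: int, n_reps: int, seed: int = 0) -> float:
    """Monte Carlo estimate of the same probability (site 0 carries the mutation)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_reps):
        # One replicate: n_sites draws with replacement; stop once site 0 appears.
        if any(rng.randrange(n_sites) == 0 for _ in range(n_sites)):
            hits += 1
    return hits / n_reps


def bayesian_support(n_sites: int, threshold: float) -> float:
    """ASSUMPTION (not from the abstract): the branch is kept when
    n_sites * w > threshold, with w one weight of a flat Dirichlet.
    The marginal of w is Beta(1, n_sites - 1), so P(w > x) = (1 - x)**(n_sites - 1)."""
    return (1.0 - threshold / n_sites) ** (n_sites - 1)


def simulate_bayesian(n_sites: int, threshold: float, n_reps: int, seed: int = 0) -> float:
    """Monte Carlo estimate of bayesian_support under the same assumption."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_reps):
        # Normalised Exp(1) draws are distributed as a flat Dirichlet.
        draws = [rng.expovariate(1.0) for _ in range(n_sites)]
        if n_sites * draws[0] / sum(draws) > threshold:
            hits += 1
    return hits / n_reps


if __name__ == "__main__":
    n = 30000  # hypothetical alignment length
    print(f"frequentist: {frequentist_support(n):.3f} (1 - 1/e = {1 - math.exp(-1):.3f})")
    print(f"Bayesian, hypothetical threshold 0.1: {bayesian_support(n, 0.1):.3f}")
```

With multinomial resampling a site is either present or absent from a replicate, whereas Dirichlet weights are always strictly positive, so some rule such as the threshold assumed here is needed before a Bayesian support below 100% can arise for a single-mutation branch.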

Source journal: Molecular Biology and Evolution (Biology: Evolutionary Biology)
CiteScore: 19.70
Self-citation rate: 3.70%
Articles published: 257
Review time: 1 month
About the journal: Molecular Biology and Evolution publishes research at the interface of molecular (including genomics) and evolutionary biology. It considers manuscripts containing patterns, processes, and predictions at all levels of organization: population, taxonomic, functional, and phenotypic. It is interested in fundamental discoveries, new and improved methods, resources, technologies, and theories advancing evolutionary research. It also publishes balanced reviews of recent developments in genome evolution and forward-looking perspectives suggesting future directions in molecular evolution applications.