Navigating sampling bias in discrete phylogeographic analysis: assessing the performance of an adjusted Bayes factor.

IF 5.3 | Zone 1, Biology | Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY
Fabiana Gámbaro, Maylis Layan, Guy Baele, Bram Vrancken, Simon Dellicour
Journal: Molecular Biology and Evolution
Publication date: 2025-10-07
Publication type: Journal Article
DOI: 10.1093/molbev/msaf253 (https://doi.org/10.1093/molbev/msaf253)
Citations: 0

Abstract

Bayesian phylogeographic inference is widely used in molecular epidemiological studies to reconstruct the dispersal history of pathogens. Discrete phylogeographic analysis treats geographic locations as discrete traits and infers lineage transition events among them, and is typically followed by a Bayes factor (BF) test to assess the statistical support. In the standard BF (BFstd) test, the relative abundance of the involved trait states is not considered, which can be problematic in the case of unbalanced sampling. Existing methods to correct sampling bias in discrete phylogeographic analyses using a continuous-time Markov chain (CTMC) model often require additional epidemiological information to balance the sampling effort among locations. As such data are not necessarily available, alternative approaches that rely solely on available genomic data are needed. In this context, we assess the performance of a modification of the BFstd, the adjusted Bayes factor (BFadj), which incorporates information on the relative abundance of samples by location when inferring support for transition events and root locations, without requiring additional data. Using a simulation framework, we assess the statistical performance of BFstd and BFadj under varying levels of sampling bias, estimating their type I and type II error rates. Our results show that BFadj complements the BFstd by reducing type I errors at the cost of increasing type II errors for inferred transition events, while improving both type I and type II errors in root location inference. Our findings provide guidelines for implementing the complementary BFadj to detect and mitigate sampling bias in discrete phylogeographic inference using CTMC modelling.
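
The abstract gives no formulas, so the following is only a minimal Python sketch of the two quantities it names, under stated assumptions. A Bayes factor is in general the ratio of posterior odds to reference (prior) odds; the function name `bayes_factor` and its `reference_prob` argument are hypothetical, and the idea that an adjusted test would pass a reference probability reflecting the relative abundance of samples is an assumption for illustration, not the authors' definition of BFadj. The `error_rates` helper uses the textbook definitions of type I (false positive) and type II (false negative) error rates against a known simulated truth.

```python
import math
from typing import Sequence, Tuple


def bayes_factor(posterior_prob: float, reference_prob: float) -> float:
    """Ratio of posterior odds to reference odds.

    reference_prob is the probability expected without support from the
    data (the prior in a standard test). Assumption: an abundance-aware
    test would supply a different reference value here.
    """
    if not (0.0 < posterior_prob < 1.0 and 0.0 < reference_prob < 1.0):
        raise ValueError("probabilities must lie strictly between 0 and 1")
    posterior_odds = posterior_prob / (1.0 - posterior_prob)
    reference_odds = reference_prob / (1.0 - reference_prob)
    return posterior_odds / reference_odds


def error_rates(supported: Sequence[bool],
                truly_present: Sequence[bool]) -> Tuple[float, float]:
    """Return (type I rate, type II rate) for a set of tested events.

    supported[i]     -- the test declared event i supported
    truly_present[i] -- event i occurred in the simulated truth
    """
    if len(supported) != len(truly_present):
        raise ValueError("inputs must have the same length")
    pairs = list(zip(supported, truly_present))
    fp = sum(1 for s, t in pairs if s and not t)
    tn = sum(1 for s, t in pairs if not s and not t)
    fn = sum(1 for s, t in pairs if not s and t)
    tp = sum(1 for s, t in pairs if s and t)
    # Rates are undefined (NaN) when there are no events of that class.
    type_1 = fp / (fp + tn) if (fp + tn) else math.nan
    type_2 = fn / (fn + tp) if (fn + tp) else math.nan
    return type_1, type_2


if __name__ == "__main__":
    # Same posterior probability, two illustrative reference probabilities.
    print(bayes_factor(0.9, 0.5))
    print(bayes_factor(0.9, 0.75))
    print(error_rates([True, True, False, False],
                      [True, False, True, False]))
```

With the posterior probability held fixed, a larger reference probability yields a smaller Bayes factor. This is consistent with the trade-off the abstract reports for transition events (fewer type I errors, more type II errors), although the actual adjustment used by the authors may differ from this sketch.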

Source journal: Molecular Biology and Evolution (Biology: Evolutionary Biology)
CiteScore: 19.70
Self-citation rate: 3.70%
Articles published: 257
Review time: 1 month
Journal overview: Molecular Biology and Evolution publishes research at the interface of molecular (including genomics) and evolutionary biology. It considers manuscripts containing patterns, processes, and predictions at all levels of organization: population, taxonomic, functional, and phenotypic. It is interested in fundamental discoveries, new and improved methods, resources, technologies, and theories advancing evolutionary research. It also publishes balanced reviews of recent developments in genome evolution and forward-looking perspectives suggesting future directions in molecular evolution applications.