Hierarchical Heuristic Species Delimitation under the Multispecies Coalescent Model with Migration

Impact factor 6.1 · Zone 1 (Biology) · JCR Q1 (Evolutionary Biology)
Daniel Kornai, Xiyun Jiao, Jiayi Ji, Tomáš Flouri, Ziheng Yang
Systematic Biology. Published 2024-08-20. DOI: 10.1093/sysbio/syae050 (https://doi.org/10.1093/sysbio/syae050)

Abstract

The multispecies coalescent (MSC) model accommodates genealogical fluctuations across the genome and provides a natural framework for comparative analysis of genomic sequence data from closely related species to infer the history of species divergence and gene flow. Given a set of populations, hypotheses of species delimitation (and species phylogeny) may be formulated as instances of MSC models (e.g., MSC for one species versus MSC for two species) and compared using Bayesian model selection. This approach, implemented in the program bpp, has been found to be prone to over-splitting. Alternatively, heuristic criteria based on population parameters (such as population split times, population sizes, and migration rates) estimated from genomic data may be used to delimit species. Here we develop hierarchical merge and split algorithms for heuristic species delimitation based on the genealogical divergence index (𝑔𝑑𝑖) and implement them in a Python pipeline called hhsd. We characterize the behavior of the 𝑔𝑑𝑖 under a few simple scenarios of gene flow. We apply the new approaches to a dataset simulated under a model of isolation by distance as well as three empirical datasets. Our tests suggest that the new approaches produce sensible results and are less prone to over-splitting. We discuss possible strategies for accommodating paraphyletic species in the hierarchical algorithm, as well as the challenges of species delimitation based on heuristic criteria.
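To make the heuristic criterion concrete, the sketch below computes the 𝑔𝑑𝑖 for the simplest case of two populations with no gene flow, using the standard definition from the literature, 𝑔𝑑𝑖 = 1 − exp(−2τ/θ), where τ is the population split time and θ is the population size parameter. This formula is not given in the abstract, and the cutoffs of 0.2 and 0.7 are the commonly cited rule of thumb, assumed here for illustration only. The function names `gdi_no_migration` and `classify_gdi` are hypothetical and are not part of hhsd or bpp.

```python
import math


def gdi_no_migration(tau: float, theta: float) -> float:
    """gdi for a population of size parameter theta that split tau ago.

    Assumes no gene flow after the split (standard literature definition,
    not taken from the abstract): gdi = 1 - exp(-2 * tau / theta).
    """
    if tau < 0 or theta <= 0:
        raise ValueError("tau must be >= 0 and theta must be > 0")
    return 1.0 - math.exp(-2.0 * tau / theta)


def classify_gdi(gdi: float, low: float = 0.2, high: float = 0.7) -> str:
    """Apply rule-of-thumb cutoffs (assumed defaults, not from the abstract)."""
    if gdi < low:
        return "single species"    # candidate for merging
    if gdi > high:
        return "distinct species"  # candidate for splitting
    return "ambiguous"


if __name__ == "__main__":
    # Hypothetical parameter values chosen only to show the three outcomes.
    for tau in (0.001, 0.005, 0.01):
        g = gdi_no_migration(tau, theta=0.01)
        print(f"tau={tau:.3f} theta=0.010 gdi={g:.3f} -> {classify_gdi(g)}")
```

A hierarchical merge or split procedure would apply a decision of this kind repeatedly along a guide tree, re-estimating τ and θ after each change. With migration, the 𝑔𝑑𝑖 no longer has this closed form, which is one reason the paper characterizes its behavior under gene flow separately.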
Source journal: Systematic Biology (Biology: Evolutionary Biology)
CiteScore: 13.00 · Self-citation rate: 7.70% · Articles published: 70 · Review time: 6-12 weeks
About the journal: Systematic Biology is the bimonthly journal of the Society of Systematic Biologists. Papers for the journal are original contributions to the theory, principles, and methods of systematics as well as phylogeny, evolution, morphology, biogeography, paleontology, genetics, and the classification of all living things. A Points of View section offers a forum for discussion, while book reviews and announcements of general interest are also featured.