Bounding the Softwired Parsimony Score of a Phylogenetic Network
Janosch Döcker, Simone Linz, Kristina Wicke
Bulletin of Mathematical Biology 86(10):121, published 2024-08-22
DOI: https://doi.org/10.1007/s11538-024-01350-9
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11341636/pdf/
Abstract
In comparison to phylogenetic trees, phylogenetic networks are more suitable to represent complex evolutionary histories of species whose past includes reticulation such as hybridisation or lateral gene transfer. However, the reconstruction of phylogenetic networks remains challenging and computationally expensive due to their intricate structural properties. For example, the small parsimony problem, which is solvable in polynomial time for phylogenetic trees, becomes NP-hard on phylogenetic networks under softwired and parental parsimony, even for a single binary character and structurally constrained networks. To calculate the parsimony score of a phylogenetic network N, these two parsimony notions consider different exponential-size sets of phylogenetic trees that can be extracted from N and infer the minimum parsimony score over all trees in the set. In this paper, we ask: What is the maximum difference between the parsimony score of any phylogenetic tree that is contained in the set of considered trees and a phylogenetic tree whose parsimony score equates to the parsimony score of N? Given a gap-free sequence alignment of multi-state characters and a rooted binary level-k phylogenetic network, we use the novel concept of an informative blob to show that this difference is bounded by k + 1 times the softwired parsimony score of N. In particular, the difference is independent of the alignment length and the number of character states. We show that an analogous bound can be obtained for the softwired parsimony score of semi-directed networks, while under parental parsimony, on the other hand, such a bound does not hold.
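To make the quantities in the abstract concrete, the following minimal sketch computes, for a single character, the smallest and largest parsimony score over the trees obtained from a small rooted binary network by keeping exactly one incoming edge per reticulation. The smallest value is the softwired parsimony score. Everything here is an illustrative assumption rather than the authors' method: the function names, the edge-list encoding, the toy level-1 network, and the brute-force enumeration, which is exponential in the number of reticulations. Tree scores are computed with a standard Sankoff-style dynamic program, chosen because it handles the degree-2 vertices left behind when an edge is dropped.

```python
from itertools import product

def tree_score(children, root, leaf_state, states):
    """Minimum number of state changes on a rooted tree (Sankoff-style DP)."""
    def cost(node):
        if node in leaf_state:
            # A labelled leaf is forced to its observed state.
            return {s: 0 if s == leaf_state[node] else float("inf") for s in states}
        total = {s: 0 for s in states}
        for child in children.get(node, []):
            c = cost(child)
            for s in states:
                # Cheapest child state, paying 1 if it differs from s.
                total[s] += min(c[t] + (s != t) for t in states)
        return total
    return min(cost(root).values())

def switchings(edges):
    """Yield edge lists that keep exactly one incoming edge per reticulation."""
    parents = {}
    for u, v in edges:
        parents.setdefault(v, []).append(u)
    retics = [v for v, ps in parents.items() if len(ps) > 1]
    for choice in product(*(parents[v] for v in retics)):
        kept = dict(zip(retics, choice))
        yield [(u, v) for u, v in edges if v not in kept or kept[v] == u]

def softwired_range(edges, root, leaf_state):
    """Return (min, max) tree score over all switchings of the network."""
    states = sorted(set(leaf_state.values()))
    scores = []
    for tree_edges in switchings(edges):
        children = {}
        for u, v in tree_edges:
            children.setdefault(u, []).append(v)
        scores.append(tree_score(children, root, leaf_state, states))
    return min(scores), max(scores)

# Hypothetical level-1 network: h is the only reticulation (parents x and z).
edges = [("r", "x"), ("r", "y"), ("x", "a"), ("x", "h"), ("y", "z"),
         ("y", "d"), ("z", "h"), ("z", "c"), ("h", "b")]
character = {"a": 0, "b": 0, "c": 1, "d": 1}

low, high = softwired_range(edges, "r", character)
print(low, high)  # prints "1 2"
```

In this toy case the softwired score is 1 and the worst considered tree scores 2, so the difference is 1, which is consistent with the stated bound of k + 1 = 2 times the softwired score for k = 1.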
About the journal:
The Bulletin of Mathematical Biology, the official journal of the Society for Mathematical Biology, disseminates original research findings and other information relevant to the interface of biology and the mathematical sciences. Contributions should have relevance to both fields. In order to accommodate the broad scope of new developments, the journal accepts a variety of contributions, including:
Original research articles focused on new biological insights gained with the help of tools from the mathematical sciences or new mathematical tools and methods with demonstrated applicability to biological investigations
Research in mathematical biology education
Reviews
Commentaries
Perspectives, and contributions that discuss issues important to the profession
All contributions are peer-reviewed.