On the Correctness of Maximum Parsimony for Data with Few Substitutions in the NNI Neighborhood of Phylogenetic Trees

Mareike Fischer
Annals of Combinatorics, volume 29, issue 2, pages 615-635. Published 2024-11-14. DOI: 10.1007/s00026-024-00725-y
https://link.springer.com/article/10.1007/s00026-024-00725-y

Abstract

Estimating phylogenetic trees, which depict the relationships between different species, from aligned sequence data (such as DNA, RNA, or proteins) is one of the main aims of evolutionary biology. However, tree reconstruction criteria like maximum parsimony do not necessarily lead to unique trees, and in some cases they even fail to recognize the “correct” tree (i.e., the tree on which the data were generated). On the other hand, a recent study has shown that for an alignment containing precisely those binary characters (sites) which require up to two substitutions on a given tree, this tree is the unique maximum parsimony tree. The aim of the present paper is to generalize this result in the following sense: we show that for a tree T with n leaves, as long as \(k<\frac{n}{8}+\frac{11}{9}-\frac{1}{18}\sqrt{9\cdot \left( \frac{n}{4}\right) ^2+16}\) (or, equivalently, \(n>9k-11+\sqrt{9k^2-22k+17}\), which in particular holds for all \(n\ge 12k\)), the maximum parsimony tree for the alignment containing all binary characters which require (up to or precisely) k substitutions on T is unique in the NNI neighborhood of T and coincides with T. In other words, within the NNI neighborhood of T, T is the unique most parsimonious tree for this alignment. This partially answers a recently published conjecture affirmatively. Additionally, we show that for \(n\ge 8\) and for k on the order of \(\frac{n}{2}\), there is always a pair of phylogenetic trees T and \(T'\) which are NNI neighbors, but for which the alignment of characters requiring precisely k substitutions each on T requires fewer substitutions in total on \(T'\).
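
The sufficient condition \(n>9k-11+\sqrt{9k^2-22k+17}\) stated in the abstract can be checked for concrete values of n and k. The following is a minimal sketch and not part of the paper; the function names `bound_holds` and `largest_k` are hypothetical. To avoid floating-point trouble when n sits exactly on the boundary, it moves \(9k-11\) to the left-hand side and squares, which is valid because \(9k^2-22k+17\) is positive for every k (its discriminant is negative).

```python
def bound_holds(n: int, k: int) -> bool:
    """Return True if n > 9k - 11 + sqrt(9k^2 - 22k + 17).

    Rewritten as (n - 9k + 11) > sqrt(9k^2 - 22k + 17) and squared,
    so the comparison uses exact integer arithmetic.
    """
    lhs = n - 9 * k + 11
    radicand = 9 * k * k - 22 * k + 17  # always positive
    return lhs > 0 and lhs * lhs > radicand


def largest_k(n: int) -> int:
    """Largest k >= 0 for which the condition holds (0 if k = 1 already fails)."""
    k = 0
    while bound_holds(n, k + 1):
        k += 1
    return k


if __name__ == "__main__":
    print(largest_k(24))       # 3
    print(bound_holds(10, 2))  # False: n = 10 equals the bound for k = 2
    # Spot check of the abstract's remark that n >= 12k suffices.
    print(all(bound_holds(12 * k, k) for k in range(1, 1000)))  # True
```

For example, with n = 24 leaves the condition covers k = 1, 2, 3 but not k = 4. This only evaluates the stated inequality; it does not compute parsimony scores or verify the theorem itself.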
