Author: Mareike Fischer
Journal: Annals of Combinatorics, volume 29, issue 2, pages 615-635
Published: 2024-11-14
DOI: 10.1007/s00026-024-00725-y
Article: https://link.springer.com/article/10.1007/s00026-024-00725-y
On the Correctness of Maximum Parsimony for Data with Few Substitutions in the NNI Neighborhood of Phylogenetic Trees
Estimating phylogenetic trees, which depict the relationships between different species, from aligned sequence data (such as DNA, RNA, or proteins) is one of the main aims of evolutionary biology. However, tree reconstruction criteria like maximum parsimony do not necessarily lead to unique trees and in some cases even fail to recognize the “correct” tree (i.e., the tree on which the data was generated). On the other hand, a recent study has shown that for an alignment containing precisely those binary characters (sites) which require up to two substitutions on a given tree, this tree will be the unique maximum parsimony tree. It is the aim of the present paper to generalize this recent result in the following sense: We show that for a tree T with n leaves, as long as \(k<\frac{n}{8}+\frac{11}{9}-\frac{1}{18}\sqrt{9\cdot \left( \frac{n}{4}\right) ^2+16}\) (or, equivalently, \(n>9k-11+\sqrt{9k^2-22k+17}\), which in particular holds for all \(n\ge 12k\)), the maximum parsimony tree for the alignment containing all binary characters which require (up to or precisely) k substitutions on T will be unique in the NNI neighborhood of T and it will coincide with T, too. In other words, within the NNI neighborhood of T, T is the unique most parsimonious tree for the said alignment. This partially answers a recently published conjecture affirmatively. Additionally, we show that for \(n\ge 8\) and for k being in the order of \(\frac{n}{2}\), there is always a pair of phylogenetic trees T and \(T'\) which are NNI neighbors, but for which the alignment of characters requiring precisely k substitutions each on T in total requires fewer substitutions on \(T'\).
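The two forms of the condition on k and n stated in the abstract can be explored numerically. The following is a minimal sketch, not code from the paper: the function names `bound_on_k`, `satisfies_n_form`, and `max_k` are hypothetical and introduced only for illustration. It evaluates the bound on k as a floating-point number, tests the equivalent condition on n with exact integer arithmetic (squaring both sides to avoid rounding), and checks the abstract's remark that the condition holds whenever \(n\ge 12k\).

```python
from math import sqrt


def bound_on_k(n: int) -> float:
    """Right-hand side of k < n/8 + 11/9 - (1/18)*sqrt(9*(n/4)^2 + 16).

    Floating-point value, intended for display only.
    """
    return n / 8 + 11 / 9 - sqrt(9 * (n / 4) ** 2 + 16) / 18


def satisfies_n_form(n: int, k: int) -> bool:
    """Exact test of n > 9k - 11 + sqrt(9k^2 - 22k + 17).

    Rewritten as (n - 9k + 11) > sqrt(radicand) and squared, which is
    valid because the radicand is positive for every k.
    """
    lhs = n - 9 * k + 11
    radicand = 9 * k * k - 22 * k + 17
    return lhs > 0 and lhs * lhs > radicand


def max_k(n: int) -> int:
    """Largest integer k >= 1 meeting the condition for n leaves (0 if none)."""
    k = 0
    while satisfies_n_form(n, k + 1):
        k += 1
    return k


if __name__ == "__main__":
    for n in (8, 12, 24, 48, 96):
        print(f"n={n:3d}  bound={bound_on_k(n):.4f}  max k={max_k(n)}")

    # The abstract notes that the condition holds for all n >= 12k.
    assert all(satisfies_n_form(12 * k, k) for k in range(1, 1000))
```

With n = 12k the left-hand side of the squared comparison is (3k + 11)^2 = 9k^2 + 66k + 121, which exceeds 9k^2 - 22k + 17 for every positive k, so the final assertion is expected to pass. The integer test is used for the actual decision because the inequality is strict and the floating-point bound can land exactly on an integer (for instance n = 10, k = 2 sits on the boundary and does not satisfy the condition).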
About the journal:
Annals of Combinatorics publishes outstanding contributions to combinatorics with a particular focus on algebraic and analytic combinatorics, as well as the areas of graph and matroid theory. Special regard will be given to new developments and topics of current interest to the community represented by our editorial board.
The scope of Annals of Combinatorics is covered by the following three tracks:
Algebraic Combinatorics:
Enumerative combinatorics, symmetric functions, Schubert calculus / Combinatorial Hopf algebras, cluster algebras, Lie algebras, root systems, Coxeter groups / Discrete geometry, tropical geometry / Discrete dynamical systems / Posets and lattices
Analytic and Algorithmic Combinatorics:
Asymptotic analysis of counting sequences / Bijective combinatorics / Univariate and multivariable singularity analysis / Combinatorics and differential equations / Resolution of hard combinatorial problems by making essential use of computers / Advanced methods for evaluating counting sequences or combinatorial constants / Complexity and decidability aspects of combinatorial sequences / Combinatorial aspects of the analysis of algorithms
Graphs and Matroids:
Structural graph theory, graph minors, graph sparsity, decompositions and colorings / Planar graphs and topological graph theory, geometric representations of graphs / Directed graphs, posets / Metric graph theory / Spectral and algebraic graph theory / Random graphs, extremal graph theory / Matroids, oriented matroids, matroid minors / Algorithmic approaches