Influence of long-branch bias on phylogenetic analysis.

IF 1.2, Zone 4 (Biology), Q4 BIOCHEMISTRY & MOLECULAR BIOLOGY

Genes & genetic systems, Pub Date: 2025-03-15, Epub Date: 2025-02-15, DOI: 10.1266/ggs.24-00151

Tomoaki Watanabe, Shohei Nakata, Tokumasa Horiike

Citations: 0

Abstract

In phylogenetic analysis, long-branch attraction (LBA) occurs when two distantly related species with long branches are mistakenly grouped as each other's closest relatives. Previous research on this issue has focused on phylogenetic trees with four operational taxonomic units and three possible topologies, using two models: the Felsenstein tree, in which the two long branches are not closely related, and the Farris tree, in which the two long branches are most closely related. For the Felsenstein tree, the maximum parsimony method is more prone than the maximum likelihood (ML) method to estimating an incorrect tree shape, whereas for the Farris tree the opposite tendency is observed. However, the underlying reason for these differences remains unclear. Therefore, we inferred phylogenetic trees from sequence data generated by molecular evolution simulations on model trees with different long-branch lengths, and measured the tree shapes and branch lengths of the inferred trees. Our findings revealed that the tree inference bias caused by the presence of long branches (defined here as 'long-branch bias') increases with the accumulation of mutations and affects all model trees and phylogenetic inference methods. In other words, on Felsenstein trees, methods that are highly sensitive to long-branch bias tend to produce LBA, and on Farris trees, the same methods tend to infer apparently correct phylogenetic trees because of this bias. Thus, methods sensitive to long-branch bias always infer the same tree shape. Additionally, long-branch bias causes similar misestimations of branch lengths in both Felsenstein and Farris trees inferred by neighbor-joining or ML. This insight into long-branch bias should allow a more reliable interpretation of phylogenetic trees, for example by accounting for shifts of branching points, and thereby improve the accuracy of future research in molecular evolution.
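
The four-taxon setup described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' protocol: it assumes a Jukes-Cantor substitution model, made-up branch lengths (`LONG`, `SHORT`, `internal`), and a simple site-pattern count as the parsimony criterion, which for four taxa is equivalent to choosing the topology supported by the most parsimony-informative sites. All function names are hypothetical.

```python
import math
import random

BASES = "ACGT"

def jc_change_prob(d):
    """Jukes-Cantor probability that a site differs across a branch of length d."""
    return 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))

def evolve(seq, d, rng):
    """Evolve a sequence along one branch of length d (substitutions per site)."""
    p = jc_change_prob(d)
    return [rng.choice([x for x in BASES if x != b]) if rng.random() < p else b
            for b in seq]

def simulate(lengths, internal, n_sites, rng):
    """Simulate taxa A, B, C, D on the true tree ((A,B),(C,D))."""
    x = [rng.choice(BASES) for _ in range(n_sites)]  # node joining A and B
    y = evolve(x, internal, rng)                     # node joining C and D
    a, b = evolve(x, lengths["A"], rng), evolve(x, lengths["B"], rng)
    c, d = evolve(y, lengths["C"], rng), evolve(y, lengths["D"], rng)
    return a, b, c, d

def parsimony_choice(a, b, c, d):
    """Pick the topology supported by the most informative sites.

    Ties go to the topology listed first in the dictionary.
    """
    support = {"AB|CD": 0, "AC|BD": 0, "AD|BC": 0}
    for sa, sb, sc, sd in zip(a, b, c, d):
        if sa == sb and sc == sd and sa != sc:
            support["AB|CD"] += 1
        elif sa == sc and sb == sd and sa != sb:
            support["AC|BD"] += 1
        elif sa == sd and sb == sc and sa != sb:
            support["AD|BC"] += 1
    return max(support, key=support.get)

def fraction_choosing(topology, lengths, internal=0.05, n_sites=1000, reps=20, seed=1):
    """Fraction of simulated replicates in which parsimony picks `topology`."""
    rng = random.Random(seed)
    hits = sum(parsimony_choice(*simulate(lengths, internal, n_sites, rng)) == topology
               for _ in range(reps))
    return hits / reps

# Illustrative branch lengths only; these values are assumptions, not from the paper.
LONG, SHORT = 1.5, 0.05
felsenstein = {"A": LONG, "B": SHORT, "C": LONG, "D": SHORT}  # long branches not sisters
farris = {"A": LONG, "B": LONG, "C": SHORT, "D": SHORT}       # long branches are sisters

if __name__ == "__main__":
    print("Felsenstein, long branches grouped:", fraction_choosing("AC|BD", felsenstein))
    print("Farris, long branches grouped:", fraction_choosing("AB|CD", farris))
```

With branch lengths this extreme, parsimony would be expected to group the two long branches in most replicates of both trees: on the Felsenstein tree that grouping is wrong (LBA), while on the Farris tree it happens to match the true topology. This mirrors the abstract's point that a method sensitive to long-branch bias infers the same tree shape in both cases.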

Source journal