Influence of long-branch bias on phylogenetic analysis.
Tomoaki Watanabe, Shohei Nakata, Tokumasa Horiike
Genes & genetic systems, published 2025-03-15 (Epub 2025-02-15). DOI: 10.1266/ggs.24-00151 (https://doi.org/10.1266/ggs.24-00151)
Abstract
In phylogenetic analysis, long-branch attraction (LBA) occurs when two distantly related species with long branches are mistakenly grouped as each other's closest relatives. Previous research on this issue has focused on phylogenetic trees with four operational taxonomic units (OTUs), which have three possible topologies, using two model trees: the Felsenstein tree, in which the two long branches are not closely related, and the Farris tree, in which the two long branches are most closely related. For the Felsenstein tree, the maximum parsimony (MP) method is more prone than the maximum likelihood (ML) method to inferring an incorrect tree shape, whereas for the Farris tree the opposite tendency is observed. However, the underlying reason for these differences remains unclear. We therefore simulated molecular evolution on model phylogenetic trees with different long-branch lengths, inferred phylogenetic trees from the resulting sequence data, and measured the tree shapes and branch lengths of the inferred trees. Our findings revealed that the tree inference bias caused by the presence of long branches (defined here as 'long-branch bias') increases with the accumulation of mutations and influences all model trees and phylogenetic inference methods. In other words, in Felsenstein trees, methods that are highly sensitive to long-branch bias tend to cause LBA, whereas in Farris trees the same methods tend to infer apparently correct phylogenetic trees because of this bias. Thus, methods sensitive to long-branch bias always infer the same tree shape, one in which the two long branches are grouped together, regardless of the model tree. Additionally, long-branch bias causes similar misestimations of branch lengths in both Felsenstein and Farris trees inferred by neighbor-joining (NJ) or ML. This insight into long-branch bias will lead to a more reliable interpretation of phylogenetic trees, including shifts in the position of branching points, and should improve the accuracy of future research in molecular evolution.
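
The four-taxon setup described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' pipeline: it assumes the Jukes-Cantor (JC69) substitution model, hypothetical branch lengths (`LONG = 1.5`, `SHORT = 0.05` expected substitutions per site), 5000 sites, and a simple count of parsimony-informative site patterns in place of a full maximum parsimony program. All function and variable names are illustrative.

```python
import math
import random

BASES = "ACGT"

def evolve(base, t, rng):
    """Evolve one base along a branch of length t (expected substitutions
    per site) under JC69; returns the base at the end of the branch."""
    p_change = 0.75 * (1.0 - math.exp(-4.0 * t / 3.0))
    if rng.random() < p_change:
        return rng.choice([b for b in BASES if b != base])
    return base

def simulate_site(lengths, rng):
    """Simulate one site on the true tree ((A,B),(C,D)).
    u is the ancestor of A and B, v the ancestor of C and D."""
    u = rng.choice(BASES)
    v = evolve(u, lengths["internal"], rng)
    return (evolve(u, lengths["A"], rng), evolve(u, lengths["B"], rng),
            evolve(v, lengths["C"], rng), evolve(v, lengths["D"], rng))

def parsimony_support(lengths, n_sites, seed):
    """Count parsimony-informative sites supporting each of the three
    possible unrooted four-taxon topologies."""
    rng = random.Random(seed)
    support = {"AB|CD": 0, "AC|BD": 0, "AD|BC": 0}
    for _ in range(n_sites):
        a, b, c, d = simulate_site(lengths, rng)
        if a == b and c == d and a != c:
            support["AB|CD"] += 1
        elif a == c and b == d and a != b:
            support["AC|BD"] += 1
        elif a == d and b == c and a != b:
            support["AD|BC"] += 1
    return support

def best_topology(support):
    """Topology with the most supporting informative sites."""
    return max(support, key=support.get)

# Assumed (hypothetical) branch lengths; the true tree is always AB|CD.
LONG, SHORT = 1.5, 0.05
# Felsenstein-type tree: long branches A and C are not sisters.
felsenstein = {"A": LONG, "B": SHORT, "C": LONG, "D": SHORT, "internal": SHORT}
# Farris-type tree: long branches A and B are sisters.
farris = {"A": LONG, "B": LONG, "C": SHORT, "D": SHORT, "internal": SHORT}

fel = parsimony_support(felsenstein, 5000, seed=1)
far = parsimony_support(farris, 5000, seed=1)
print("Felsenstein-type tree:", fel, "->", best_topology(fel))
print("Farris-type tree:     ", far, "->", best_topology(far))
```

Under these assumed settings, the pattern count groups the two long branches in both cases: A with C on the Felsenstein-type tree (an LBA error, since the true tree is AB|CD) and A with B on the Farris-type tree (the correct topology). This matches the abstract's point that a method sensitive to long-branch bias infers the same shape either way.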