wQFM-TREE: highly accurate and scalable quartet-based species tree inference from gene trees.

IF 2.4 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY
Bioinformatics advances Pub Date : 2025-03-13 eCollection Date: 2025-01-01 DOI:10.1093/bioadv/vbaf053
Abdur Rafi, Ahmed Mahir Sultan Rumi, Sheikh Azizul Hakim, Sohaib, Md Toki Tahmid, Rabib Jahin Ibn Momin, Tanjeem Azwad Zaman, Rezwana Reaz, Md Shamsuzzoha Bayzid
Citations: 0

Abstract

Motivation: Summary methods are becoming increasingly popular for species tree estimation from multi-locus data in the presence of gene tree discordance. Accurate Species TRee Algorithm (ASTRAL), a leading method in this class, solves the Maximum Quartet Support Species Tree problem within a constrained solution space, while heuristics like Weighted Quartet Fiduccia-Mattheyses (wQFM) and Weighted Quartet MaxCut (wQMC) use weighted quartets and a divide-and-conquer strategy. Recent studies showed wQFM to be more accurate than ASTRAL and wQMC, though its scalability is hindered by the computational demands of explicitly generating and weighting Θ(n^4) quartets, where n is the number of taxa. Here, we introduce wQFM-TREE, a novel summary method that enhances wQFM by avoiding explicit quartet generation and weighting, enabling its application to large datasets.
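
To make the Θ(n^4) cost concrete, here is a minimal brute-force sketch of explicit quartet generation and weighting, the step wQFM-TREE is described as avoiding. It is an illustration only, not the authors' implementation: the nested-tuple tree encoding and the names `clades`, `weighted_quartets`, and `quartet_count` are assumptions made for this example, and a quartet's weight is assumed to be the number of gene trees that display it.

```python
from collections import Counter
from itertools import combinations
from math import comb


def clades(tree):
    """Return (leaf set of `tree`, leaf sets of all its internal nodes).

    Hypothetical encoding: a leaf is a string, an internal node is a tuple
    of child subtrees.
    """
    if isinstance(tree, str):
        return frozenset([tree]), []
    leaves = frozenset()
    found = []
    for child in tree:
        child_leaves, child_found = clades(child)
        leaves |= child_leaves
        found.extend(child_found)
    found.append(leaves)
    return leaves, found


def weighted_quartets(gene_trees):
    """Count, for every 4-taxon subset, how many gene trees display each
    resolved topology ab|cd. This enumerates all C(n, 4) subsets per tree."""
    weights = Counter()
    for tree in gene_trees:
        leaves, internal = clades(tree)
        for four in combinations(sorted(leaves), 4):
            q = frozenset(four)
            for clade in internal:
                side = clade & q
                # A clade holding exactly two of the four taxa separates
                # them from the other two, which fixes the topology.
                if len(side) == 2:
                    other = q - side
                    key = tuple(sorted([tuple(sorted(side)),
                                        tuple(sorted(other))]))
                    weights[key] += 1
                    break
    return weights


def quartet_count(n):
    """Number of 4-taxon subsets among n taxa."""
    return comb(n, 4)


if __name__ == "__main__":
    trees = [
        (("a", "b"), ("c", "d")),
        (("a", "b"), ("c", "d")),
        (("a", "c"), ("b", "d")),
    ]
    for topology, weight in weighted_quartets(trees).items():
        print(topology, weight)
    for n in (200, 1000):
        print(n, quartet_count(n))
```

For the 200 to 1000 taxa range mentioned in the Results, `quartet_count` gives roughly 6.5 × 10^7 to 4.1 × 10^10 four-taxon subsets, each with up to three possible topologies, which is why explicit enumeration becomes the bottleneck.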

Results: Extensive simulations under diverse and challenging model conditions, with hundreds or thousands of taxa and genes, consistently demonstrate that wQFM-TREE matches or improves upon the accuracy of ASTRAL. It outperformed ASTRAL in 25 of 27 model conditions (statistically significant in 20) involving 200-1000 taxa. Moreover, applying wQFM-TREE to re-analyze the green plant dataset from the One Thousand Plant Transcriptomes Initiative produced a tree highly congruent with established evolutionary relationships of plants. wQFM-TREE's remarkable accuracy and scalability make it a strong competitor to leading methods. Its algorithmic and combinatorial innovations also enhance quartet-based computations, advancing phylogenetic estimation.

Availability and implementation: wQFM-TREE is freely available in open source form at https://github.com/abdur-rafi/wQFM-TREE.
