Treewidth-based algorithms for the small parsimony problem on networks.

IF 1.7 · CAS Zone 4 (Biology) · JCR Q4 · BIOCHEMICAL RESEARCH METHODS

Algorithms for Molecular Biology · Pub Date: 2022-08-20 · DOI: 10.1186/s13015-022-00216-w

Celine Scornavacca, Mathias Weller


Citations: 3

Abstract



Background: Phylogenetic reconstruction is one of the paramount challenges of contemporary bioinformatics. A subtask of existing tree reconstruction algorithms is modeled by the SMALL PARSIMONY problem: given a tree T and an assignment of character-states to its leaves, assign states to the internal nodes of T so as to minimize the parsimony score, that is, the number of edges of T connecting nodes with different states. While this problem is polynomial-time solvable on trees, the matter is more complicated if T contains reticulate events such as hybridizations or recombinations, i.e. when T is a network. Indeed, three different versions of the parsimony score on networks have been proposed, and each of them is NP-hard to decide. Existing parameterized algorithms focus on combining the number c of possible character-states with the number of reticulate events (per biconnected component).
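To make the tree case concrete, the following is a minimal sketch of a standard bottom-up dynamic program (in the style of Sankoff's algorithm with unit costs) for SMALL PARSIMONY on a rooted tree. It is not taken from the paper; the function name `small_parsimony`, the dictionary-based tree encoding, and the example tree are assumptions made for illustration only.

```python
from math import inf


def small_parsimony(children, leaf_state, states, root):
    """Minimum number of tree edges joining nodes with different states.

    children:   dict mapping each internal node to a list of its children
    leaf_state: dict mapping each leaf to its fixed character-state
    states:     iterable of all possible character-states
    root:       root node of the tree
    (This encoding is a hypothetical choice for the sketch.)
    """
    states = list(states)

    def cost(node):
        # cost(node)[s] = best score of the subtree below `node`
        # under the condition that `node` itself receives state s.
        if node in leaf_state:
            return {s: (0 if s == leaf_state[node] else inf) for s in states}
        table = {s: 0 for s in states}
        for child in children[node]:
            child_cost = cost(child)
            for s in states:
                # The edge node-child costs 1 exactly when the states differ.
                table[s] += min(child_cost[t] + (s != t) for t in states)
        return table

    return min(cost(root).values())


# Hypothetical example: the tree ((a,b),(c,d)) with one character.
children = {"r": ["x", "y"], "x": ["a", "b"], "y": ["c", "d"]}
leaf_state = {"a": "A", "b": "C", "c": "G", "d": "G"}
score = small_parsimony(children, leaf_state, "ACGT", "r")
```

Each node's table is filled once with c² work per child edge, so the sketch runs in time polynomial in the tree size and the number c of states. The recursion is kept for readability; very deep trees would need an iterative traversal instead.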

Results: We consider the parameter treewidth t of the underlying undirected graph of the input network, presenting dynamic programming algorithms for (slight generalizations of) all three versions of the parsimony problem on size-n networks running in times [Formula: see text], [Formula: see text], and [Formula: see text], respectively. Our algorithms use a formulation of the treewidth that may facilitate formalizing treewidth-based dynamic programming algorithms on phylogenetic networks for other problems.

Conclusions: Our algorithms allow the computation of the three popular parsimony scores, modeling the evolutionary development of a (multistate) character on a given phylogenetic network of low treewidth. Our results subsume and improve previously known algorithms for all three variants. While our results rely on being given a "good" tree-decomposition of the input, encouraging theoretical results as well as practical implementations producing them are publicly available. We present a reformulation of tree decompositions in terms of "agreeing trees" on the same set of nodes. As this formulation may come more naturally to researchers and engineers developing algorithms for phylogenetic networks, we hope to render exploiting the input network's treewidth as a parameter more accessible to this audience.
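For readers unfamiliar with tree decompositions, the sketch below checks the standard textbook definition (every edge lies inside some bag, and the bags containing any given vertex form a connected subtree) and reports the width. It does not implement the paper's "agreeing trees" reformulation, which the abstract does not spell out. The function names, the input encoding, and the small example network with one reticulation are assumptions made for illustration.

```python
from collections import defaultdict


def _connected(nodes, adjacency):
    """True if `nodes` induce a connected subgraph under `adjacency`."""
    nodes = set(nodes)
    if not nodes:
        return True
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        for nxt in adjacency[stack.pop()]:
            if nxt in nodes and nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen == nodes


def decomposition_width(edges, bags, tree_edges):
    """Return the width of a tree decomposition; raise ValueError if invalid.

    edges:      (u, v) pairs of the underlying undirected graph
    bags:       dict mapping each decomposition node to a set of vertices
    tree_edges: (i, j) pairs between decomposition nodes
    """
    tree_edges = list(tree_edges)
    adjacency = defaultdict(set)
    for i, j in tree_edges:
        adjacency[i].add(j)
        adjacency[j].add(i)
    # The decomposition must itself be a tree: connected, |nodes| - 1 edges.
    if len(tree_edges) != len(bags) - 1 or not _connected(bags, adjacency):
        raise ValueError("decomposition is not a tree")
    # Every graph edge must be contained in at least one bag.
    for u, v in edges:
        if not any(u in bag and v in bag for bag in bags.values()):
            raise ValueError(f"edge {(u, v)} is not covered by any bag")
    # The bags holding a given vertex must form a connected subtree.
    for x in set().union(*bags.values()):
        holders = [i for i, bag in bags.items() if x in bag]
        if not _connected(holders, adjacency):
            raise ValueError(f"bags containing {x!r} are not connected")
    return max(len(bag) for bag in bags.values()) - 1


# Hypothetical network: root r, tree nodes u and v, reticulation h,
# leaves x, y, z. Edge directions are dropped (underlying undirected graph).
edges = [("r", "u"), ("r", "v"), ("u", "h"), ("v", "h"),
         ("u", "x"), ("v", "y"), ("h", "z")]
bags = {0: {"r", "u", "v"}, 1: {"u", "v", "h"},
        2: {"u", "x"}, 3: {"v", "y"}, 4: {"h", "z"}}
tree_edges = [(0, 1), (1, 2), (1, 3), (1, 4)]
width = decomposition_width(edges, bags, tree_edges)
```

The single reticulation creates the cycle r, u, h, v in the underlying graph, which is why the example decomposition needs bags of size three. A dynamic program over such a decomposition typically keeps one table entry per state assignment to a bag, which is where the dependence on both the number c of states and the treewidth t comes from.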


Source journal

Algorithms for Molecular Biology (Biology: Biochemical Research Methods)

CiteScore: 2.40

Self-citation rate: 10.00%

Articles published:

Review time: >12 weeks

About the journal: Algorithms for Molecular Biology publishes articles on novel algorithms for biological sequence and structure analysis, phylogeny reconstruction, and combinatorial algorithms and machine learning. Areas of interest include but are not limited to: algorithms for RNA and protein structure analysis, gene prediction and genome analysis, comparative sequence analysis and alignment, phylogeny, gene expression, machine learning, and combinatorial algorithms. Where appropriate, manuscripts should describe applications to real-world data. However, pure algorithm papers are also welcome if future applications to biological data are to be expected, or if they address complexity or approximation issues of novel computational problems in molecular biology. Articles about novel software tools will be considered for publication if they contain some algorithmically interesting aspects.