On Two Measures of Distance between Fully-Labelled Trees

G. Bernardini, P. Bonizzoni, Paweł Gawrychowski
Venue: Annual Symposium on Combinatorial Pattern Matching
Published: 2020-02-13
DOI: 10.4230/LIPIcs.CPM.2020.6 (https://doi.org/10.4230/LIPIcs.CPM.2020.6)
Citations: 5

Abstract

The last decade brought a significant increase in the amount of data and a variety of new inference methods for reconstructing the detailed evolutionary history of various cancers. This creates a need for efficient procedures for comparing rooted trees that represent the evolution of mutations in tumor phylogenies. Motivated by this need, Bernardini et al. [CPM 2019] recently introduced the notion of rearrangement distance for fully-labelled trees. This notion is built from two operations: one that permutes the labels of the nodes, and one that affects the topology of the tree. Each operation alone defines a distance that can be computed in polynomial time, while the actual rearrangement distance, which combines the two, was proven to be NP-hard to compute. We answer two questions left open by the previous work. First, what is the complexity of computing the permutation distance? Second, is there a constant-factor approximation algorithm for estimating the rearrangement distance between two arbitrary trees? We answer the first question by showing, via a two-way reduction, that calculating the permutation distance between two trees on $n$ nodes is equivalent, up to polylogarithmic factors, to finding a maximum cardinality matching in a sparse bipartite graph. In particular, by plugging in the algorithm of Liu and Sidford [arXiv 2020], we obtain an $O(n^{4/3+o(1)})$-time algorithm for computing the permutation distance between two trees on $n$ nodes. We then answer the second question positively by designing a linear-time constant-factor approximation algorithm that requires no assumptions on the trees.
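
To make the target problem of the reduction concrete, the following is a minimal sketch of maximum cardinality matching in a bipartite graph. It uses the classic augmenting-path method (often called Kuhn's algorithm), which runs in O(VE) time; it is not the Liu and Sidford algorithm, and it does not implement the reduction from permutation distance described in the abstract. The function name `max_bipartite_matching`, the adjacency-dict input format, and the toy graph are assumptions made for illustration only.

```python
from typing import Dict, Hashable, List, Set


def max_bipartite_matching(
    adj: Dict[Hashable, List[Hashable]]
) -> Dict[Hashable, Hashable]:
    """Return a maximum matching as a dict {right_vertex: left_vertex}.

    `adj` maps every left vertex to the list of right vertices adjacent to it.
    Illustrative augmenting-path sketch, O(V * E); recursion depth grows with
    the length of augmenting paths, so it is meant for small inputs.
    """
    match_of_right: Dict[Hashable, Hashable] = {}

    def try_augment(u: Hashable, visited: Set[Hashable]) -> bool:
        # Look for an augmenting path that starts at the left vertex u.
        for v in adj[u]:
            if v in visited:
                continue
            visited.add(v)
            # Take v if it is free, or if its current partner can move elsewhere.
            if v not in match_of_right or try_augment(match_of_right[v], visited):
                match_of_right[v] = u
                return True
        return False

    for u in adj:
        try_augment(u, set())
    return match_of_right


# Hypothetical toy instance: left vertices a, b, c; right vertices x, y, z.
edges = {"a": ["x", "y"], "b": ["x"], "c": ["y", "z"]}
matching = max_bipartite_matching(edges)
print(len(matching))  # prints 3
```

In a sparse graph with O(n) edges this simple method takes O(n^2) time in the worst case, which is the kind of bound that faster matching algorithms improve upon.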