A cubic algorithm for the generalized rank median of three genomes.

IF 1.5 | Zone 4, Biology | Q4, Biochemical Research Methods
Algorithms for Molecular Biology Pub Date : 2019-07-26 eCollection Date: 2019-01-01 DOI:10.1186/s13015-019-0150-y
Leonid Chindelevitch, Sean La, Joao Meidanis
Citations: 2

Abstract

Background: The area of genome rearrangements has given rise to a number of interesting biological, mathematical and algorithmic problems. Among these, one of the most intractable ones has been that of finding the median of three genomes, a special case of the ancestral reconstruction problem. In this work we re-examine our recently proposed way of measuring genome rearrangement distance, namely, the rank distance between the matrix representations of the corresponding genomes, and show that the median of three genomes can be computed exactly in polynomial time $O(n^{\omega})$, where $\omega \leq 3$, with respect to this distance, when the median is allowed to be an arbitrary orthogonal matrix.
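
As a rough illustration of the rank distance mentioned above, the following Python sketch builds a 0/1 matrix for a small genome and computes the rank of the difference of two such matrices. The encoding is an assumption made for this example, not a detail stated in the abstract: extremities are numbered 0..n-1, an adjacency between i and j sets entries (i, j) and (j, i) to 1, and a telomere sets a 1 on the diagonal. The helper names genome_matrix, rank and rank_distance are hypothetical.

```python
from fractions import Fraction


def genome_matrix(n, adjacencies):
    """Build the n x n 0/1 matrix of a genome on extremities 0..n-1.

    Assumed encoding: entry (i, j) is 1 when extremities i and j form an
    adjacency; an unpaired extremity (telomere) gets a 1 on the diagonal.
    """
    m = [[0] * n for _ in range(n)]
    paired = set()
    for i, j in adjacencies:
        m[i][j] = m[j][i] = 1
        paired.update((i, j))
    for t in range(n):
        if t not in paired:
            m[t][t] = 1
    return m


def rank(matrix):
    """Exact rank by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0  # number of pivots found so far
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


def rank_distance(a, b):
    """Rank distance d(A, B) = rank(A - B)."""
    diff = [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]
    return rank(diff)


# Two toy genomes on four extremities.
a = genome_matrix(4, [(0, 1), (2, 3)])  # two adjacencies, no telomeres
b = genome_matrix(4, [(1, 2)])          # one adjacency, telomeres 0 and 3
print(rank_distance(a, b))
```

Exact rational arithmetic is used here only to keep the sketch dependency-free and free of floating-point tolerance choices; a numerical linear algebra library would be the natural choice for larger inputs.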

Results: We define the five fundamental subspaces depending on three input genomes, and use their properties to show that a particular action on each of these subspaces produces a median. In the process we introduce the notion of M-stable subspaces. We also show that the median found by our algorithm is always orthogonal, symmetric, and conserves any adjacencies or telomeres present in at least 2 out of 3 input genomes.
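
The properties claimed for the median (orthogonal, symmetric, conserving features shared by at least 2 of the 3 inputs) can be checked mechanically on a candidate matrix. The sketch below is one possible reading, under the assumption that "conserving an adjacency or telomere" means that any entry equal to 1 in at least two of the 0/1 input matrices is also 1 in the candidate. The function names and the tolerance parameter are hypothetical.

```python
def is_symmetric(m):
    """True if M equals its transpose."""
    n = len(m)
    return all(m[i][j] == m[j][i] for i in range(n) for j in range(n))


def is_orthogonal(m, tol=1e-9):
    """True if M * M^T equals the identity entrywise, up to tol."""
    n = len(m)
    return all(
        abs(sum(m[i][k] * m[j][k] for k in range(n)) - (1 if i == j else 0)) <= tol
        for i in range(n)
        for j in range(n)
    )


def conserves_majority(m, a, b, c, tol=1e-9):
    """True if every 1-entry shared by at least two of a, b, c is 1 in m."""
    n = len(m)
    return all(
        abs(m[i][j] - 1) <= tol
        for i in range(n)
        for j in range(n)
        if a[i][j] + b[i][j] + c[i][j] >= 2
    )


# Toy inputs on two extremities: one adjacency versus two telomeres.
swap = [[0, 1], [1, 0]]
ident = [[1, 0], [0, 1]]
print(is_symmetric(swap), is_orthogonal(swap), conserves_majority(swap, swap, swap, ident))
```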

Conclusions: We test our method on both simulated and real data. We find that the majority of the realistic inputs result in genomic outputs, and for those that do not, our two heuristics perform well in terms of reconstructing a genomic matrix attaining a score close to the lower bound, while running in a reasonable amount of time. We conclude that the rank distance is not only theoretically intriguing, but also practically useful for median-finding, and potentially ancestral genome reconstruction.
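
To make "a score close to the lower bound" concrete: assuming the score of a candidate M is the sum of its distances to the three inputs, and assuming the lower bound meant here is the usual triangle-inequality bound, any metric d satisfies d(M,A) + d(M,B) + d(M,C) >= (d(A,B) + d(B,C) + d(C,A)) / 2. The sketch below takes the distance as a parameter and demonstrates the inequality with a toy metric on integers so that it stays self-contained; a rank distance on matrices would be passed as dist instead. The function names are hypothetical.

```python
def median_score(m, genomes, dist):
    """Sum of distances from candidate m to each input."""
    return sum(dist(m, g) for g in genomes)


def triangle_lower_bound(a, b, c, dist):
    """Half the sum of pairwise distances; bounds every score from below."""
    return (dist(a, b) + dist(b, c) + dist(c, a)) / 2


def toy_dist(x, y):
    """Toy metric on integers, standing in for a distance on genomes."""
    return abs(x - y)


inputs = (1, 4, 9)
bound = triangle_lower_bound(*inputs, toy_dist)
for candidate in (0, 4, 5, 9):
    # Every candidate's score is at least the bound.
    print(candidate, median_score(candidate, inputs, toy_dist), bound)
```

Passing the distance in as a function keeps the bound independent of how genomes are represented, which is all the triangle-inequality argument requires.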

Source journal: Algorithms for Molecular Biology (Biology: Biochemical Research Methods)
CiteScore: 2.40
Self-citation rate: 10.00%
Articles published: 16
Review time: >12 weeks
Journal description: Algorithms for Molecular Biology publishes articles on novel algorithms for biological sequence and structure analysis, phylogeny reconstruction, and combinatorial algorithms and machine learning. Areas of interest include but are not limited to: algorithms for RNA and protein structure analysis, gene prediction and genome analysis, comparative sequence analysis and alignment, phylogeny, gene expression, machine learning, and combinatorial algorithms. Where appropriate, manuscripts should describe applications to real-world data. However, pure algorithm papers are also welcome if future applications to biological data are to be expected, or if they address complexity or approximation issues of novel computational problems in molecular biology. Articles about novel software tools will be considered for publication if they contain some algorithmically interesting aspects.