Signed rearrangement distances considering repeated genes, intergenic regions, and indels

Impact factor 0.9 · Zone 4 (Mathematics) · JCR Q4: Computer Science, Interdisciplinary Applications
Gabriel Siqueira, Alexsandro Oliveira Alexandrino, Zanoni Dias
Journal of Combinatorial Optimization. Published 2023-09-10. DOI: 10.1007/s10878-023-01083-w (https://doi.org/10.1007/s10878-023-01083-w)
Citations: 0

Abstract

Genome rearrangement distance problems make it possible to estimate the evolutionary distance between genomes. These problems aim to compute the minimum number of mutations, called rearrangement events, necessary to transform one genome into another. Two commonly studied rearrangements are the reversal, which inverts a sequence of genes, and the transposition, which exchanges two consecutive sequences of genes. Seminal works on this topic focused on the sequence of genes and assumed that each gene occurs exactly once in each genome. More realistic models assume that a gene may have multiple copies or may appear in only one of the genomes. Other models also take into account the nucleotides between consecutive pairs of genes, which are called intergenic regions. This work combines all these generalizations, defining the signed intergenic reversal distance (SIRD), the signed intergenic reversal and transposition distance (SIRTD), the signed intergenic reversal and indels distance (SIRID), and the signed intergenic reversal, transposition, and indels distance (SIRTID) problems. We show a relation between these problems and the signed minimum common intergenic string partition (SMCISP) problem. From this relation, we derive \(\varTheta(k)\)-approximation algorithms for the SIRD and SIRTD problems, where k is the maximum number of copies of a gene in the genomes. These algorithms also work as heuristics for the SIRID and SIRTID problems. Additionally, we present some parameterized algorithms for SMCISP that ensure constant approximation factors for the distance problems. Our experimental tests on simulated genomes show an improvement in the rearrangement distances with the use of the partition algorithms.
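
To make the two rearrangement events concrete, the following is a minimal Python sketch, not the authors' algorithms. It assumes a genome is a plain list of signed integers (the sign is the gene orientation) and, as a simplification, ignores intergenic regions and indels entirely. The function names `reversal`, `transposition`, and `max_copies` are hypothetical and chosen for illustration only; `max_copies` counts copies within a single genome, whereas the abstract's k is taken over the genomes being compared.

```python
from collections import Counter


def reversal(genome, i, j):
    """Invert genome[i..j] (inclusive): reverse the order and flip each sign.

    Assumption: intergenic regions are not modeled in this sketch.
    """
    segment = [-g for g in reversed(genome[i:j + 1])]
    return genome[:i] + segment + genome[j + 1:]


def transposition(genome, i, j, k):
    """Exchange the two consecutive blocks genome[i:j] and genome[j:k]."""
    return genome[:i] + genome[j:k] + genome[i:j] + genome[k:]


def max_copies(genome):
    """Largest number of copies of any gene in one genome, ignoring sign."""
    return max(Counter(abs(g) for g in genome).values())


genome = [1, -2, 3, 2, 4]  # gene 2 occurs twice, once in each orientation
after_reversal = reversal(genome, 1, 3)
after_transposition = transposition(genome, 0, 2, 4)
copies = max_copies(genome)
```

A distance problem then asks for the shortest sequence of such operations that turns one genome into another; this sketch only applies single events and does not search for such a sequence.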

Source journal
Journal of Combinatorial Optimization
Category: Mathematics - Computer Science, Interdisciplinary Applications
CiteScore: 2.00
Self-citation rate: 10.00%
Articles published: 83
Review time: 6 months
Journal description: The objective of the Journal of Combinatorial Optimization is to advance and promote the theory and applications of combinatorial optimization, an area of research at the intersection of applied mathematics, computer science, and operations research that overlaps with many other areas such as computational complexity, computational biology, VLSI design, communication networks, and management science. It includes complexity analysis and algorithm design for combinatorial optimization problems, as well as numerical experiments and problem discovery with applications in science and engineering. The Journal of Combinatorial Optimization publishes refereed papers dealing with all theoretical, computational, and applied aspects of combinatorial optimization. It also publishes reviews of appropriate books and special issues of journals.