Assignment of orthologous genes in unbalanced genomes using cycle packing of adjacency graphs

IF 1.1 · Zone 4 (Computer Science) · JCR Q4: COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Gabriel Siqueira, Andre Rodrigues Oliveira, Alexsandro Oliveira Alexandrino, Géraldine Jean, Guillaume Fertin, Zanoni Dias
Journal of Heuristics, Journal Article, published 2024-05-31
DOI: 10.1007/s10732-024-09528-z (https://doi.org/10.1007/s10732-024-09528-z)
Citations: 0

Abstract

The adjacency graph is a structure used to model genomes in several rearrangement distance problems. In particular, most studies use properties of a maximum cycle packing of this graph to develop bounds and algorithms for rearrangement distance problems, such as the reversal distance, the reversal and transposition distance, and the double cut and join distance. When each genome has no repeated genes, there exists only one cycle packing for the graph. However, when each genome may have repeated genes, the problem of finding a maximum cycle packing for the adjacency graph (adjacency graph packing) is NP-hard. In this work, we develop a randomized greedy heuristic and a genetic algorithm heuristic for the adjacency graph packing problem for genomes with repeated genes and unequal gene content. We also propose new algorithms with simple implementation and good practical performance for reversal distance and reversal and transposition distance in genomes without repeated genes, which we combine with the heuristics to find solutions for the problems with repeated genes. We present experimental results and compare the application of these heuristics with the application of the MSOAR framework in rearrangement distance problems. Lastly, we apply our genetic algorithm heuristic to real genomic data to validate its practical use.
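
To illustrate the remark that genomes without repeated genes admit only one cycle packing, here is a minimal sketch that counts the alternating cycles of a signed permutation against the identity. It assumes the standard breakpoint-graph construction for signed permutations, which may differ in detail from the adjacency graph defined in the paper. The function names `count_cycles` and `reversal_lower_bound` are hypothetical, and this is not the authors' heuristic. The value n + 1 - c is the classical lower bound on the signed reversal distance.

```python
def count_cycles(perm):
    """Count alternating cycles of a signed permutation versus the identity.

    Assumption: standard breakpoint-graph construction, not necessarily
    the exact adjacency graph used in the paper.
    """
    n = len(perm)
    # Each gene x becomes two extremities: (2|x|-1, 2|x|), reversed if x < 0.
    seq = [0]
    for x in perm:
        a, b = 2 * abs(x) - 1, 2 * abs(x)
        seq.extend((a, b) if x > 0 else (b, a))
    seq.append(2 * n + 1)

    # Black edges: extremities that are adjacent in the given permutation.
    black = {}
    for i in range(0, len(seq), 2):
        u, v = seq[i], seq[i + 1]
        black[u], black[v] = v, u

    # Gray edges: extremities 2k and 2k+1, adjacent in the identity.
    def gray(u):
        return u + 1 if u % 2 == 0 else u - 1

    seen = set()
    cycles = 0
    for start in seq:
        if start in seen:
            continue
        cycles += 1
        u = start
        # Walk black edge, then gray edge, until the cycle closes.
        while u not in seen:
            seen.add(u)
            v = black[u]
            seen.add(v)
            u = gray(v)
    return cycles


def reversal_lower_bound(perm):
    """Classical lower bound n + 1 - c on the signed reversal distance."""
    return len(perm) + 1 - count_cycles(perm)


if __name__ == "__main__":
    for p in ([1, 2, 3], [-1], [-2, -1], [3, 1, 2]):
        print(p, count_cycles(p), reversal_lower_bound(p))
```

Because every extremity has exactly one black and one gray edge here, the cycle decomposition is forced. With repeated genes, a gray edge could be matched to several copies, which is where the packing problem becomes a genuine choice.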

Source journal: Journal of Heuristics (Engineering and Technology, Computer Science: Theory and Methods)
CiteScore: 5.80
Self-citation rate: 0.00%
Articles published: 19
Review time: 6 months
About the journal: The Journal of Heuristics provides a forum for advancing the state-of-the-art in the theory and practical application of techniques for solving problems approximately that cannot be solved exactly. It fosters the development, understanding, and practical use of heuristic solution techniques for solving business, engineering, and societal problems. It considers the importance of theoretical, empirical, and experimental work related to the development of heuristics. The journal presents practical applications, theoretical developments, decision analysis models that consider issues of rational decision making with limited information, artificial intelligence-based heuristics applied to a wide variety of problems, learning paradigms, and computational experimentation. Officially cited as: J Heuristics.