ReAlign-N: an integrated realignment approach for multiple nucleic acid sequence alignment, combining global and local realignments.

IF 4 Q1 GENETICS & HEREDITY
NAR Genomics and Bioinformatics Pub Date : 2024-12-18 eCollection Date: 2024-12-01 DOI:10.1093/nargab/lqae170
Yixiao Zhai, Tong Zhou, Yanming Wei, Quan Zou, Yansu Wang
{"title":"ReAlign-N: an integrated realignment approach for multiple nucleic acid sequence alignment, combining global and local realignments.","authors":"Yixiao Zhai, Tong Zhou, Yanming Wei, Quan Zou, Yansu Wang","doi":"10.1093/nargab/lqae170","DOIUrl":null,"url":null,"abstract":"<p><p>Ensuring accurate multiple sequence alignment (MSA) is essential for comprehensive biological sequence analysis. However, the complexity of evolutionary relationships often results in variations that generic alignment tools may not adequately address. Realignment is crucial to remedy this issue. Currently, there is a lack of realignment methods tailored for nucleic acid sequences, particularly for lengthy sequences. Thus, there's an urgent need for the development of realignment methods better suited to address these challenges. This study presents ReAlign-N, a realignment method explicitly designed for multiple nucleic acid sequence alignment. ReAlign-N integrates both global and local realignment strategies for improved accuracy. In the global realignment phase, ReAlign-N incorporates K-Band and innovative memory-saving technology into the dynamic programming approach, ensuring high efficiency and minimal memory requirements for large-scale realignment tasks. The local realignment stage employs full matching and entropy scoring methods to identify low-quality regions and conducts realignment through MAFFT. Experimental results demonstrate that ReAlign-N consistently outperforms initial alignments on simulated and real datasets. Furthermore, compared to ReformAlign, the only existing multiple nucleic acid sequence realignment tool, ReAlign-N, exhibits shorter running times and occupies less memory space. The source code and test data for ReAlign-N are available on GitHub (https://github.com/malabz/ReAlign-N).</p>","PeriodicalId":33994,"journal":{"name":"NAR Genomics and Bioinformatics","volume":"6 4","pages":"lqae170"},"PeriodicalIF":4.0000,"publicationDate":"2024-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11655299/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"NAR Genomics and Bioinformatics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1093/nargab/lqae170","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2024/12/1 0:00:00","PubModel":"eCollection","JCR":"Q1","JCRName":"GENETICS & HEREDITY","Score":null,"Total":0}
引用次数: 0

Abstract

Ensuring accurate multiple sequence alignment (MSA) is essential for comprehensive biological sequence analysis. However, the complexity of evolutionary relationships often results in variations that generic alignment tools may not adequately address. Realignment is crucial to remedy this issue. Currently, there is a lack of realignment methods tailored for nucleic acid sequences, particularly for lengthy sequences. Thus, there's an urgent need for the development of realignment methods better suited to address these challenges. This study presents ReAlign-N, a realignment method explicitly designed for multiple nucleic acid sequence alignment. ReAlign-N integrates both global and local realignment strategies for improved accuracy. In the global realignment phase, ReAlign-N incorporates K-Band and innovative memory-saving technology into the dynamic programming approach, ensuring high efficiency and minimal memory requirements for large-scale realignment tasks. The local realignment stage employs full matching and entropy scoring methods to identify low-quality regions and conducts realignment through MAFFT. Experimental results demonstrate that ReAlign-N consistently outperforms initial alignments on simulated and real datasets. Furthermore, compared to ReformAlign, the only existing multiple nucleic acid sequence realignment tool, ReAlign-N, exhibits shorter running times and occupies less memory space. The source code and test data for ReAlign-N are available on GitHub (https://github.com/malabz/ReAlign-N).

ReAlign-N:一种综合的多核酸序列比对方法,结合了全局和局部的比对。
确保精确的多序列比对(MSA)是全面的生物序列分析的必要条件。然而,进化关系的复杂性经常导致通用校准工具可能无法充分解决的变化。重新调整是解决这个问题的关键。目前,缺乏适合核酸序列的重组方法,特别是针对长序列的重组方法。因此,迫切需要开发更适合解决这些挑战的调整方法。本研究提出了一种明确设计用于多核酸序列比对的重组方法——ReAlign-N。ReAlign-N集成了全局和局部调整策略,以提高准确性。在全局调整阶段,ReAlign-N将K-Band和创新的内存节省技术纳入动态规划方法,确保大规模调整任务的高效率和最小内存需求。局部重排阶段采用全匹配和熵评分方法识别低质量区域,通过MAFFT进行重排。实验结果表明,ReAlign-N在模拟和真实数据集上都优于初始对齐。此外,与ReformAlign相比,现有唯一的多核酸序列重组工具ReAlign-N具有更短的运行时间和更少的内存空间。ReAlign-N的源代码和测试数据可在GitHub (https://github.com/malabz/ReAlign-N)上获得。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
CiteScore
8.00
自引率
2.20%
发文量
95
审稿时长
15 weeks
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信