REPrise: de novo interspersed repeat detection using inexact seeding.

IF 4.7; Zone 2, Biology; JCR Q1, Genetics & Heredity
Atsushi Takeda, Daisuke Nonaka, Yuta Imazu, Tsukasa Fukunaga, Michiaki Hamada
Mobile DNA 16(1): 16. Journal article, published 2025-04-03.
DOI: 10.1186/s13100-025-00353-0 (https://doi.org/10.1186/s13100-025-00353-0)
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11966803/pdf/

Abstract

Background: Interspersed repeats occupy a large part of many eukaryotic genomes, and thus their accurate annotation is essential for various genome analyses. Database-free de novo repeat detection approaches are powerful for annotating genomes that lack well-curated repeat databases. However, existing tools do not yet have sufficient repeat detection performance.

Results: In this study, we developed REPrise, a de novo interspersed repeat detection software program based on a seed-and-extension method. Although the algorithm of REPrise is similar to that of RepeatScout, which is currently the de facto standard tool, we incorporated three unique techniques into REPrise: inexact seeding, affine gap scoring and loose masking. Analyses of rice and simulation genome datasets showed that REPrise outperformed RepeatScout in terms of sensitivity, especially when the repeat sequences contained many mutations. Furthermore, when applied to the complete human genome dataset T2T-CHM13, REPrise demonstrated the potential to detect novel repeat sequence families.

Conclusion: REPrise can detect interspersed repeats with high sensitivity even in long genomes. Our software enhances repeat annotation in diverse genomic studies, contributing to a deeper understanding of genomic structures.
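
The abstract names inexact seeding as one of REPrise's techniques but does not describe how it is implemented. As a rough illustration of the general idea only, the sketch below counts seed occurrences while tolerating a fixed number of substitutions. The Hamming-distance criterion, the function names `hamming` and `seed_hits`, and the toy sequence are all assumptions made for this example and are not taken from REPrise.

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def seed_hits(genome: str, seed: str, max_mismatches: int = 0) -> list[int]:
    """Start positions where `seed` occurs in `genome` with at most
    `max_mismatches` substitutions (hypothetical helper, not REPrise's API)."""
    k = len(seed)
    return [
        i
        for i in range(len(genome) - k + 1)
        if hamming(genome[i:i + k], seed) <= max_mismatches
    ]


# Toy sequence: three copies of a 9-bp repeat, two of which carry
# a single substitution relative to the seed.
genome = "TTACGTACGATGG" + "CCACGTTCGATAA" + "GGACGTACGCTTT"
seed = "ACGTACGAT"

exact = seed_hits(genome, seed, max_mismatches=0)    # only the unmutated copy
inexact = seed_hits(genome, seed, max_mismatches=1)  # all three copies

print("exact seed hits:  ", exact)
print("inexact seed hits:", inexact)
```

In this toy case an exact seed finds one copy, so the repeat would look like a single-copy sequence, whereas allowing one mismatch recovers all three copies. That is in line with the abstract's observation that sensitivity improves most when repeat copies carry many mutations.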

Source journal: Mobile DNA (Genetics & Heredity)
CiteScore: 8.20
Self-citation rate: 6.10%
Articles published: 26
Review time: 11 weeks
About the journal: Mobile DNA is an online, peer-reviewed, open access journal that publishes articles providing novel insights into DNA rearrangements in all organisms, ranging from transposition and other types of recombination mechanisms to patterns and processes of mobile element and host genome evolution. In addition, the journal will consider articles on the utility of mobile genetic elements in biotechnological methods and protocols.