Look4LTRs: a Long terminal repeat retrotransposon detection tool capable of cross species studies and discovering recently nested repeats

IF 4.7 | Zone 2 (Biology) | Q1 GENETICS & HEREDITY
Anthony B. Garza, Emmanuelle Lerat, Hani Z. Girgis
Journal: Mobile DNA
Publication date: 2024-04-16
Publication type: Journal Article
DOI: 10.1186/s13100-024-00317-w
URL: https://doi.org/10.1186/s13100-024-00317-w

Abstract

Plant genomes include large numbers of transposable elements. One particular type of these elements is flanked by two Long Terminal Repeats (LTRs) and can translocate using RNA. Such elements are known as LTR-retrotransposons; they are the most abundant type of transposons in plant genomes. They have many important functions involving gene regulation and the rise of new genes and pseudogenes in response to severe stress. Additionally, LTR-retrotransposons have several applications in biotechnology. Due to the abundance and importance of LTR-retrotransposons, multiple computational tools have been developed for their detection. However, none of these tools takes advantage of the availability of related genomes; they process one chromosome at a time. Further, recently nested LTR-retrotransposons (multiple elements of the same family inserted into each other) cannot be annotated accurately, or cannot be annotated at all, by the currently available tools. Motivated to overcome these two limitations, we built Look4LTRs, which can annotate LTR-retrotransposons in multiple related genomes simultaneously and discover recently nested elements. The methodology of Look4LTRs depends on techniques imported from the signal-processing field, graph algorithms, and machine learning, with minimal use of alignment algorithms. Four plant genomes were used in developing Look4LTRs and eight plant genomes in evaluating it against three related tools. Look4LTRs is the fastest while maintaining F1 scores (the harmonic mean of recall and precision) better than or comparable to those obtained by the other tools. Our results demonstrate the added benefit of annotating LTR-retrotransposons in multiple related genomes simultaneously and the ability to discover recently nested elements. Manual examination by an expert of six elements not included in the ground truth revealed that three elements belong to known families and two elements are likely from new families. With respect to examining recently nested LTR-retrotransposons, three out of five were confirmed to be valid elements. Look4LTRs, with its speed, accuracy, and novel features, represents a true advancement in the annotation of LTR-retrotransposons, opening the door to many studies focused on understanding their functions in plants.
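
The F1 score mentioned in the abstract can be illustrated with a minimal sketch. The function name, the counts, and the convention of returning 0.0 for undefined cases are assumptions made here for illustration; the abstract does not say how a predicted element is matched to a ground-truth element, and the numbers below are not results from the paper.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Return the harmonic mean of precision and recall from raw counts.

    Hypothetical helper for illustration only; not part of Look4LTRs.
    """
    predicted = true_positives + false_positives  # elements the tool reported
    actual = true_positives + false_negatives     # elements in the ground truth
    if predicted == 0 or actual == 0:
        return 0.0  # assumed convention when precision or recall is undefined
    precision = true_positives / predicted
    recall = true_positives / actual
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Made-up counts: perfect recall with 50% precision gives F1 = 2/3,
# showing that the harmonic mean is pulled toward the weaker of the two.
example = f1_score(true_positives=50, false_positives=50, false_negatives=0)
```

Because the harmonic mean penalizes an imbalance between the two quantities, a tool cannot reach a high F1 score by reporting many elements indiscriminately or by reporting only a few very safe ones.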
Source journal: Mobile DNA (GENETICS & HEREDITY)
CiteScore: 8.20
Self-citation rate: 6.10%
Articles published: 26
Review time: 11 weeks
About the journal: Mobile DNA is an online, peer-reviewed, open access journal that publishes articles providing novel insights into DNA rearrangements in all organisms, ranging from transposition and other types of recombination mechanisms to patterns and processes of mobile element and host genome evolution. In addition, the journal will consider articles on the utility of mobile genetic elements in biotechnological methods and protocols.