skandiver: a divergence-based analysis tool for identifying intercellular mobile genetic elements

Xiaolei Brian Zhang, Grace Oualline, Jim Shaw, Yun William Yu
arXiv - QuanBio - Genomics. Published 2024-06-17. DOI: arxiv-2406.12064 (https://doi.org/arxiv-2406.12064)
Citations: 0

Abstract

skandiver: a divergence-based analysis tool for identifying intercellular mobile genetic elements
Mobile genetic elements (MGEs) are as ubiquitous in nature as they are varied in type, ranging from viral insertions to transposons to incorporated plasmids. Horizontal transfer of MGEs across bacterial species may also pose a significant threat to global health because MGEs can harbour antibiotic resistance genes. However, despite cheap and rapid whole-genome sequencing, the varied nature of MGEs makes them difficult to fully characterize, and existing detection methods often disagree on what should count as an MGE. In this manuscript, we first define and argue in favor of a divergence-based characterization of mobile genetic elements. Using that paradigm, we present skandiver, a tool designed to efficiently detect MGEs from whole-genome assemblies without the need for gene annotation or markers. skandiver determines mobile elements via genome fragmentation, average nucleotide identity (ANI), and divergence time. By building on the scalable skani software for ANI computation, skandiver can query hundreds of complete assemblies against >65,000 representative genomes in a few minutes using 19 GB of memory, providing a scalable and efficient method for elucidating mobile element profiles in incomplete, uncharacterized genomic sequences. For isolated and integrated large plasmids (>10 kbp), skandiver's recall was 48% and 47%, MobileElementFinder's was 59% and 17%, and geNomad's was 86% and 32%, respectively. For isolated large plasmids, skandiver's recall (48%) is lower than that of the state-of-the-art reference-based methods geNomad (86%) and MobileElementFinder (59%). However, skandiver achieves higher recall on integrated plasmids and, unlike other methods, does so without comparing against a curated database, making it suitable for discovery of novel MGEs. Availability: https://github.com/YoukaiFromAccounting/skandiver
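The divergence-based idea in the abstract (fragment the genome, estimate each fragment's ANI against reference genomes, then weigh that identity against how long ago the host lineages diverged) can be illustrated with a small sketch. This is not skandiver's code: the `Hit` record, the function names, the fragment size, and both thresholds are hypothetical choices made for illustration, and the ANI values are assumed to have been computed elsewhere (for example by an ANI tool such as skani).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One fragment-vs-reference comparison (hypothetical record layout)."""
    fragment_start: int    # offset of the fragment in the query assembly
    reference: str         # name of the matched reference genome
    ani: float             # fragment-vs-reference identity, in percent
    divergence_mya: float  # divergence time between the two hosts, in Myr


def fragment_offsets(genome_length: int, fragment_size: int) -> list[int]:
    """Start offsets of non-overlapping fragments covering the genome."""
    return list(range(0, genome_length, fragment_size))


def candidate_mges(hits, min_ani=95.0, min_divergence_mya=100.0):
    """Offsets of fragments that are near-identical to a distantly related host.

    Both thresholds are illustrative assumptions, not skandiver defaults.
    """
    flagged = {
        h.fragment_start
        for h in hits
        if h.ani >= min_ani and h.divergence_mya >= min_divergence_mya
    }
    return sorted(flagged)


# Made-up hits for a 25 kbp assembly cut into 10 kbp fragments.
example_hits = [
    Hit(0, "close_relative", 99.1, 5.0),        # high ANI, recent split: expected
    Hit(10_000, "distant_species", 98.7, 450.0),  # high ANI, ancient split: flagged
    Hit(20_000, "distant_species", 82.0, 450.0),  # low ANI, ancient split: expected
]

print(fragment_offsets(25_000, 10_000))  # [0, 10000, 20000]
print(candidate_mges(example_hits))      # [10000]
```

Only the middle fragment is flagged: it is nearly identical to a genome whose host diverged long ago, which vertical inheritance alone would not be expected to produce. A real pipeline would also need to merge adjacent flagged fragments and choose thresholds empirically.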