217 closed Salmonella reference genomes using PacBio sequencing.

Impact factor: 1.9; JCR quartile: Q3 (Genetics & Heredity)
Yan Luo, Jae Hee Jang, Maria Balkey, Maria Hoffmann
BMC Genomic Data 26(1):15. Journal article, published 2025-02-28.
DOI: 10.1186/s12863-025-01304-7 (https://doi.org/10.1186/s12863-025-01304-7)
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11871702/pdf/

Abstract

Objectives: Whole genome sequencing (WGS) is widely used in food safety for the detection, investigation, and control of foodborne bacterial pathogens. However, the WGS data in most public databases, such as those of the National Center for Biotechnology Information (NCBI), consist primarily of Illumina short reads. Short reads lack some important information about repetitive regions, structural variations, and mobile genetic elements, and about the genomic location of certain important genes such as antimicrobial resistance (AMR) genes and virulence genes. To address this limitation, we contributed 217 closed circular Salmonella enterica genomes, generated using PacBio sequencing, to the NCBI Pathogen Detection (PD) database and GenBank. This dataset improves the accuracy of genome representations in the database.

Data description: High-quality complete reference genomes generated from PacBio long reads can provide essential details that are not available in draft genomes assembled from short reads. A complete reference genome supports more accurate data analysis and lets researchers connect genome variations to known genes, regulatory elements, and other genomic features. The addition of 217 complete genomes from 78 Salmonella serovars, each representing either a distinct SNP cluster within the NCBI PD database or a unique strain, significantly enriches the diversity of the reference genome database.
