Development of the Pneumococcal Genome Library, a core genome multilocus sequence typing scheme, and a taxonomic life identification number barcoding system to investigate and define pneumococcal population structure.

Impact factor: 4.0 | Zone 2 (Biology) | JCR Q1 (Genetics & Heredity)
Melissa J Jansen van Rensburg, Duncan J Berger, Iman Yassine, David Shaw, Andy Fohrmann, James E Bray, Keith A Jolley, Martin C J Maiden, Angela B Brueggemann
Journal: Microbial Genomics. Published: 2024-08-01. Publication type: Journal Article.
DOI: https://doi.org/10.1099/mgen.0.001280
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11321556/pdf/
Citation count: 0

Abstract

Investigating the genomic epidemiology of major bacterial pathogens is integral to understanding transmission, evolution, colonization, disease, antimicrobial resistance and vaccine impact. Furthermore, the recent accumulation of large numbers of whole genome sequences for many bacterial species enhances the development of robust genome-wide typing schemes to define the overall bacterial population structure and the lineages within it. Using previously published data, we developed the Pneumococcal Genome Library (PGL), a curated dataset of 30 976 genomes and contextual data for carriage and disease pneumococci recovered between 1916 and 2018 in 82 countries. We leveraged the size and diversity of the PGL to develop a core genome multilocus sequence typing (cgMLST) scheme comprising 1222 loci. Finally, using multilevel single-linkage clustering, we stratified pneumococci into hierarchical clusters based on allelic similarity thresholds and defined these with a taxonomic life identification number (LIN) barcoding system. The PGL, cgMLST scheme and LIN barcodes represent a high-quality genomic resource and fine-scale clustering approaches for the analysis of pneumococcal populations, which support the genomic epidemiology and surveillance of this leading global pathogen.
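
The idea of stratifying isolates into nested clusters and naming each isolate by its path through those clusters can be illustrated with a small sketch. This is not the authors' implementation: the function names, the toy allelic profiles and the two thresholds are all hypothetical, and thresholds are expressed here as a maximum number of mismatching loci rather than as the allelic similarity percentages used in the paper. It only shows the general technique, namely single-linkage clustering repeated at progressively stricter thresholds, with one barcode position per threshold.

```python
from itertools import combinations


def allelic_mismatches(a, b):
    """Count loci where two profiles differ; loci missing (None) in either are skipped."""
    return sum(1 for x, y in zip(a, b)
               if x is not None and y is not None and x != y)


def single_linkage(names, dist, threshold):
    """Return a cluster root per isolate. Two isolates share a cluster if a chain
    of pairs, each with distance <= threshold, connects them (union-find)."""
    parent = {n: n for n in names}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for a, b in combinations(names, 2):
        if dist[(a, b)] <= threshold:
            parent[find(b)] = find(a)
    return {n: find(n) for n in names}


def lin_like_barcodes(profiles, thresholds):
    """Assign each isolate a barcode with one position per threshold.
    Thresholds are applied from loosest to strictest, so clusters are nested and
    isolates that share a barcode prefix share a cluster at that level."""
    names = list(profiles)
    dist = {(a, b): allelic_mismatches(profiles[a], profiles[b])
            for a, b in combinations(names, 2)}
    barcodes = {n: [] for n in names}
    for t in sorted(thresholds, reverse=True):
        roots = single_linkage(names, dist, t)
        numbering = {}  # barcode prefix -> {cluster root: index within that prefix}
        for n in names:
            seen = numbering.setdefault(tuple(barcodes[n]), {})
            barcodes[n].append(seen.setdefault(roots[n], len(seen)))
    return {n: "_".join(map(str, code)) for n, code in barcodes.items()}


# Toy data: four isolates typed at six hypothetical loci (allele numbers).
profiles = {
    "iso1": [1, 1, 1, 1, 1, 1],
    "iso2": [1, 1, 1, 1, 1, 2],
    "iso3": [1, 1, 1, 1, 3, 3],
    "iso4": [9, 9, 9, 9, 9, 9],
}
codes = lin_like_barcodes(profiles, thresholds=[4, 1])
for name, code in codes.items():
    print(name, code)
```

In this toy run, iso1 and iso2 differ at one locus and therefore share a full barcode, iso3 shares only the first position with them, and iso4 differs at every locus and starts a separate top-level cluster. A real scheme would use many more loci and levels, and would need a stable way to assign codes as new genomes arrive, which this sketch does not attempt.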

Source journal: Microbial Genomics (Medicine, Epidemiology)
CiteScore: 6.60
Self-citation rate: 2.60%
Articles published: 153
Review time: 12 weeks
About the journal: Microbial Genomics (MGen) is a fully open access, mandatory open data and peer-reviewed journal publishing high-profile original research on archaea, bacteria, microbial eukaryotes and viruses.