Michaela Ruzickova, Jana Palkovicova, Ivo Papousek, Max L Cummins, Steven P Djordjevic, Monika Dolejska
mSystems, e0101024. Journal Article. Published 2025-04-08. DOI: 10.1128/msystems.01010-24 (https://doi.org/10.1128/msystems.01010-24)
The presence of multiple variants of IncF plasmid alleles in a single genome sequence can hinder accurate replicon sequence typing using in silico pMLST tools.
IncF plasmids are mobile genetic elements found in bacteria from the Enterobacteriaceae family and often carry critical antibiotic and virulence gene cargo. The classification of IncF plasmids using the plasmid Multi-Locus Sequence Typing (pMLST) tool from the Center for Genomic Epidemiology (CGE; https://www.genomicepidemiology.org/) compares the sequences of IncF alleles against a database to create a plasmid sequence type (ST). Accurate identification of plasmid STs is useful as it enables an assessment of IncF plasmid lineages associated with pandemic enterobacterial STs. Our initial observations showed discrepancies in IncF allele variants reported by pMLST in a collection of 898 Escherichia coli ST131 genomes. To evaluate the limitations of the pMLST tool, we interrogated an in-house and public repository of 70,324 E. coli genomes of various STs and other Enterobacteriaceae genomes (n = 1247). All short-read assemblies and representatives selected for long-read sequencing were used to assess pMLST allele variants and to compare the output of pMLST tool versions. When multiple allele variants occurred in a single bacterial genome, the Python and web versions of the tool randomly selected one allele to report, leading to limited and inaccurate ST identification. Discrepancies were detected in 5,804 of 72,469 genomes (8.01%). Long-read sequencing of 27 genomes confirmed multiple IncF allele variants on one plasmid or two separate IncF plasmids in a single bacterial cell. The pMLST tool was unable to accurately distinguish allele variants and their location on replicons using short-read genome assemblies, or long-read genome assemblies if the same allele variant was present more than once.
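The failure mode described above (one allele reported when a genome carries several variants of the same locus) can be illustrated with a minimal sketch. This is not the pMLST code: the function names, the `(locus, allele)` hit format, and the example hits are hypothetical. Only the two genome counts come from the abstract.

```python
from collections import defaultdict


def group_variants(hits):
    """Group (locus, allele) hits for one genome by locus.

    Every distinct allele variant is kept instead of reporting a single one.
    The (locus, allele) tuple format is an assumption made for this sketch.
    """
    by_locus = defaultdict(set)
    for locus, allele in hits:
        by_locus[locus].add(allele)
    return {locus: sorted(alleles) for locus, alleles in by_locus.items()}


def has_multiple_variants(hits):
    """Return True if any locus has more than one distinct allele variant."""
    return any(len(alleles) > 1 for alleles in group_variants(hits).values())


# Hypothetical hits for a single genome: two different FII variants.
example_hits = [("FII", 1), ("FII", 2), ("FIA", 1), ("FIB", 20)]
grouped = group_variants(example_hits)
flagged = has_multiple_variants(example_hits)

# Share of genomes with discrepancies, using the counts from the abstract.
discrepancy_pct = round(100 * 5804 / 72469, 2)
```

Here `grouped["FII"]` holds both variants, so the genome is flagged for manual review rather than silently reduced to one allele. The percentage reproduces the reported 8.01%.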
Importance: Plasmid sequence type is crucial for describing IncF plasmids because they can carry important antibiotic resistance and virulence gene cargo and are consequently associated with disease-causing enterobacterial lineages that are resistant to clinically relevant antibiotics in humans and food-producing animals. Precise reporting of IncF allele variants in IncF plasmids is therefore necessary. Comparison of the FAB formulae generated by the pMLST tool with annotated long-read genome assemblies identified inconsistencies: in some cases multiple IncF allele variants were present on the same plasmid but missing from the FAB formula, and in others two IncF plasmids were detected in one bacterial cell but the pMLST output described only one of them. Such inconsistencies may cloud interpretation of IncF plasmid replicon type in specific bacterial lineages or lead to inaccurate assumptions about host strain clonality.
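One way to avoid losing variants is to build a FAB-style string per replicon from a closed long-read assembly, as in the hedged sketch below. It assumes the common F:A:B layout with "-" for an absent locus and is restricted to three loci for illustration. Joining multiple variants with "/" is a hypothetical notation chosen here, not something the pMLST tool outputs. The replicon identifiers and hits are invented.

```python
from collections import defaultdict

# Locus name paired with the letter used in the formula (illustrative subset).
LOCI = (("FII", "F"), ("FIA", "A"), ("FIB", "B"))


def fab_per_replicon(hits):
    """Build one FAB-style string per replicon.

    hits: iterable of (replicon_id, locus, allele_number) tuples; this input
    format is an assumption of the sketch. All distinct variants of a locus
    are listed, joined with "/" (hypothetical notation).
    """
    table = defaultdict(lambda: defaultdict(set))
    for replicon, locus, allele in hits:
        table[replicon][locus].add(allele)

    formulae = {}
    for replicon, loci in table.items():
        parts = []
        for locus, letter in LOCI:
            alleles = sorted(loci.get(locus, ()))
            value = "/".join(str(a) for a in alleles) if alleles else "-"
            parts.append(letter + value)
        formulae[replicon] = ":".join(parts)
    return formulae


# Hypothetical hits: two FII variants on one plasmid, plus a second plasmid.
example_hits = [
    ("plasmid_1", "FII", 1),
    ("plasmid_1", "FII", 2),
    ("plasmid_1", "FIB", 20),
    ("plasmid_2", "FIA", 1),
]
formulae = fab_per_replicon(example_hits)
```

Keying the result by replicon keeps two IncF plasmids in one cell separate, and listing every variant keeps co-located alleles visible. Both only work when contigs correspond to whole replicons, which short-read assemblies generally do not guarantee.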
mSystems (Biochemistry, Genetics and Molecular Biology: Biochemistry)
CiteScore: 10.50
Self-citation rate: 3.10%
Articles published: 308
Review time: 13 weeks
About the journal:
mSystems™ will publish preeminent work that stems from applying technologies for high-throughput analyses to achieve insights into the metabolic and regulatory systems at the scale of both the single cell and microbial communities. The scope of mSystems™ encompasses all important biological and biochemical findings drawn from analyses of large data sets, as well as new computational approaches for deriving these insights. mSystems™ will welcome submissions from researchers who focus on the microbiome, genomics, metagenomics, transcriptomics, metabolomics, proteomics, glycomics, bioinformatics, and computational microbiology. mSystems™ will provide streamlined decisions, while carrying on ASM's tradition of rigorous peer review.