Accurate and reproducible whole-genome genotyping for bacterial genomic surveillance with Nanopore sequencing data.

Impact factor 5.4; Zone 2 (Medicine); JCR Q1 (Microbiology)

Journal of Clinical Microbiology Pub Date : 2025-07-09 Epub Date: 2025-06-13 DOI:10.1128/jcm.00369-25

K Prior, K Becker, C Brandt, A Cabal Rosel, J Dabernig-Heinz, C Kohler, M Lohde, W Ruppitsch, F Schuler, G E Wagner, A Mellmann

Article number: e0036925. Publication type: Journal Article. Full-text PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12239720/pdf/

Citations: 0

Abstract

Until recently, Oxford Nanopore Technologies (ONT) sequences lacked the accuracy required for fine-scale bacterial genomic analysis, despite advances in error rate reduction. Here, recent ONT software improvements and the ONT-core-genome multilocus sequence typing (cgMLST)-Polisher within the SeqSphere⁺ software were evaluated. We used short-read (Illumina) and long-read ONT sequences of 80 multidrug-resistant organisms (MDROs) for benchmarking. Illumina reads were assembled de novo using SKESA. For ONT, reads basecalled with the Dorado Super Accurate (SUP) model v.4.3 or v.5.0 were assembled with Flye and then polished with Medaka v.1.12 (model m4.3) or Medaka v.2.0 (bacterial methylation model). In addition, the ONT-cgMLST-Polisher was run over all assemblies. The "ground truth" (GT) hybrid assemblies were created using Hybracter v.0.10.0. Sixteen of the original 80 isolates, representing four species, were sent to six laboratories for a ring trial. The 80 MDROs basecalled with SUP m4.3 had an average cgMLST allele distance (AD) to the GT of 4.94 with Medaka v.1.12 and 1.78 with Medaka v.2.0. After further polishing of the Medaka v.2.0 data with the ONT-cgMLST-Polisher, the AD dropped to 0.09. Using data basecalled with SUP m5.0 with Medaka v.2.0 further reduced the AD significantly to 0.04. While the ring trial data basecalled with Dorado SUP m4.3 showed more variability and insufficient results for some samples, model 5.0 data resulted in average ADs of 0.36 and 0.17 without and with the ONT-cgMLST-Polisher, respectively. In conclusion, recent ONT Dorado and Medaka models combined with the ONT-cgMLST-Polisher improved ONT sequencing accuracy and made it sufficiently reproducible for genomic surveillance of bacteria.

Importance

ONT sequencing methodology is especially attractive for small and medium-sized laboratories because of its relatively low capital investment and per-sample consumable costs. However, until recently, it lacked the accuracy and reproducibility needed for bacterial genomic genotyping. Here, we present an evaluation of the most recent ONT bioinformatic improvements (basecalling and consensus polishing) and a new ONT-cgMLST-Polisher tool. We demonstrate that by applying those procedures, ONT whole-genome genotyping-based surveillance of bacteria is finally accurate and reproducible enough for routine application, even in small laboratories.
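The allele distance (AD) reported in the abstract counts the cgMLST loci at which an assembly's allele call differs from the ground-truth call. The abstract does not describe how the metric is computed inside SeqSphere⁺, so the following is only a minimal sketch under stated assumptions: allele profiles are plain dictionaries mapping a locus name to an allele ID, loci without a call in either profile are ignored pairwise, and the names `allele_distance` and `mean_distance_to_truth` as well as the toy profiles are hypothetical.

```python
from statistics import mean

# Assumption: a locus with no allele call is stored as None.
MISSING = None


def allele_distance(profile_a, profile_b):
    """Count loci called in both profiles whose allele IDs differ.

    Loci absent from, or uncalled in, either profile are skipped
    (an assumed convention, not one stated in the abstract).
    """
    shared_loci = profile_a.keys() & profile_b.keys()
    return sum(
        1
        for locus in shared_loci
        if profile_a[locus] is not MISSING
        and profile_b[locus] is not MISSING
        and profile_a[locus] != profile_b[locus]
    )


def mean_distance_to_truth(pairs):
    """Average AD over (assembly_profile, ground_truth_profile) pairs."""
    return mean(allele_distance(assembly, truth) for assembly, truth in pairs)


# Toy profiles, invented for illustration only (not data from the study).
ground_truth = {"locus1": 1, "locus2": 7, "locus3": 3, "locus4": 12}
ont_assembly = {"locus1": 1, "locus2": 8, "locus3": None, "locus4": 12}

single_ad = allele_distance(ont_assembly, ground_truth)
average_ad = mean_distance_to_truth(
    [(ont_assembly, ground_truth), (ground_truth, ground_truth)]
)
```

In this toy case only `locus2` differs and `locus3` is skipped because it has no call, so the single AD is 1 and the average over the two pairs is 0.5. An average AD such as the 0.04 quoted above would arise the same way, as a mean of per-isolate counts across the benchmark set.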


Source journal

Journal of Clinical Microbiology (Medicine: Microbiology)

CiteScore: 17.10

Self-citation rate: 4.30%

Articles published: 347

Review time: 3 months

Journal description: The Journal of Clinical Microbiology® disseminates the latest research concerning the laboratory diagnosis of human and animal infections, along with the laboratory's role in epidemiology and the management of infectious diseases.