Bridging the gap between molecular and genomic epidemiology in tuberculosis: inferring MIRU-VNTR patterns from genomic data.

IF 6.1 | Zone 2, Medicine | JCR Q1, Microbiology

Journal of Clinical Microbiology Pub Date : 2024-09-11 Epub Date: 2024-08-13 DOI:10.1128/jcm.00741-24

Sergio Buenestado-Serrano, Miguel Martínez-Lirola, Anzaan Dippenaar, Amadeo Sanz-Pérez, José Antonio Garrido-Cárdenas, Ana Belén Esteban-García, Adriana Justine García-Toledo, Cristina Rodríguez-Grande, Marta Herranz-Martín, Sheri M Saleeb, Patricia Muñoz, Robin M Warren, Laura Pérez-Lago, Darío García de Viedma

Abstract

The transition from MIRU-VNTR-based epidemiology studies in tuberculosis (TB) to genomic epidemiology has transformed how we track transmission. However, short-read sequencing is poor at analyzing repetitive regions such as the MIRU-VNTR loci. This creates a gap between the new genomic data and the large amount of information stored in historical databases. Long-read sequencing could bridge this knowledge gap by allowing analysis of repetitive regions. However, the feasibility of extracting MIRU-VNTRs from long reads and linking them to historical data has not been evaluated. In our study, an in silico arm, consisting of inference of MIRU patterns from long-read sequences (using the MIRUReader program), was compared with an experimental arm, involving standard amplification and fragment sizing. We analyzed overall performance on 39 isolates from South Africa and confirmed reproducibility in a sample enriched with 62 clustered cases from Spain. Finally, we ran 25 consecutive incident cases, demonstrating the feasibility of correctly assigning new clustered/orphan cases by linking data inferred from genomic analysis to MIRU-VNTR databases. Of the 3,024 loci analyzed, only 11 discrepancies (0.36%) were found between the two arms: three attributed to experimental error and eight to misassigned alleles from long-read sequencing. A second round of analysis of these discrepancies resulted in agreement between the experimental and in silico arms in all but one locus. Adjusting the MIRUReader program code allowed us to flag potential in silico misassignments due to suboptimal coverage or unfixed double alleles. Our study indicates that long-read sequencing could help address potential chronological and geographical gaps arising from the transition from molecular to genomic epidemiology of tuberculosis.
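The counts reported in the abstract can be cross-checked with a short calculation. The sketch below assumes the standard 24-locus MIRU-VNTR panel; the abstract does not state the panel size, but 24 loci per isolate is the value that reproduces the reported total of 3,024 loci across the three sample sets. The variable names are illustrative only.

```python
# Isolate counts as reported in the abstract.
isolates = {
    "South Africa": 39,
    "Spain (clustered cases)": 62,
    "consecutive incident cases": 25,
}

# Assumption: standard 24-locus MIRU-VNTR panel (not stated in the abstract).
LOCI_PER_ISOLATE = 24

total_isolates = sum(isolates.values())
total_loci = total_isolates * LOCI_PER_ISOLATE

# Discrepancies between the experimental and in silico arms, as reported.
discrepancies = {
    "experimental error": 3,
    "misassigned allele (long-read inference)": 8,
}
total_discrepancies = sum(discrepancies.values())

# Discrepancy rate as a percentage of all loci compared.
rate_percent = 100 * total_discrepancies / total_loci

print(f"isolates: {total_isolates}")
print(f"loci compared: {total_loci}")
print(f"discrepancies: {total_discrepancies} ({rate_percent:.2f}%)")
```

Under this assumption the totals agree with the abstract: 126 isolates, 3,024 loci, and 11 discrepancies, which rounds to 0.36%.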

Importance: The transition from molecular epidemiology in tuberculosis (TB), based on the analysis of repetitive regions (VNTR-based genotyping), to genomic epidemiology transforms the precision with which we track transmission. However, short-read sequencing, the most common method for performing genomic analysis, is poor at analyzing repetitive regions. This means that we face a gap between the new genomic data and the large amount of information stored in historical databases, which is also an obstacle to cross-national surveillance involving settings where only molecular data are available. Long-read sequencing could help bridge this knowledge gap by allowing analysis of repetitive regions. Our study demonstrates that MIRU-VNTR patterns can be successfully inferred from long-read sequences, allowing the correct assignment of new cases as clustered/orphan by linking new data extracted from genomic analysis to historical MIRU-VNTR databases. Our data may provide a starting point for bridging the knowledge gap between the molecular and genomic eras in tuberculosis epidemiology.
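The clustered/orphan assignment described above can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes that a new case counts as "clustered" when its inferred pattern is identical at every locus to at least one pattern in the historical database, and "orphan" otherwise. The isolate identifiers, the shortened 4-locus patterns, and the function names are all hypothetical and chosen for readability; real patterns would hold one repeat count per locus in the panel.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# A MIRU-VNTR pattern: one repeat count per locus, in a fixed locus order.
Pattern = Tuple[int, ...]


def index_database(db: Dict[str, Pattern]) -> Dict[Pattern, List[str]]:
    """Group historical isolate IDs by their exact pattern."""
    index: Dict[Pattern, List[str]] = defaultdict(list)
    for isolate_id, pattern in db.items():
        index[pattern].append(isolate_id)
    return index


def assign_case(pattern: Pattern, index: Dict[Pattern, List[str]]) -> Tuple[str, List[str]]:
    """Label a new case as clustered (exact match found) or orphan."""
    matches = index.get(pattern, [])
    if matches:
        return "clustered", sorted(matches)
    return "orphan", []


# Hypothetical historical database (toy 4-locus patterns).
historical = {
    "H001": (2, 4, 3, 5),
    "H002": (2, 4, 3, 5),
    "H003": (1, 3, 3, 2),
}
index = index_database(historical)

# Hypothetical new cases with patterns inferred in silico.
new_cases = {
    "N001": (2, 4, 3, 5),  # identical to H001 and H002
    "N002": (2, 4, 2, 5),  # differs at one locus from every entry
}

for case_id, pattern in new_cases.items():
    label, matches = assign_case(pattern, index)
    print(case_id, label, matches)
```

Exact matching is the simplest possible rule. A real surveillance workflow might tolerate single-locus variants or treat flagged low-confidence loci separately, which this sketch does not attempt.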

Source journal: Journal of Clinical Microbiology (Medicine, Microbiology)

CiteScore: 17.10

Self-citation rate: 4.30%

Articles published: 347

Review time: 3 months

About the journal: The Journal of Clinical Microbiology® disseminates the latest research concerning the laboratory diagnosis of human and animal infections, along with the laboratory's role in epidemiology and the management of infectious diseases.