Gene conversion and duplication contribute to genetic variation in an outbreak of Mycobacterium tuberculosis
Christoph Stritt, Michelle Reitsma, Ana Maria Garcia Marin, Galo Goig, Anna Dötsch, Sonia Borrell, Christian Beisel, Iñaki Comas, Daniela Brites, Sebastien Gagneux
Microbial Genomics 11(5), published 2025-05-01. DOI: 10.1099/mgen.0.001396 (https://doi.org/10.1099/mgen.0.001396)
Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12046097/pdf/
Abstract
Repeats are the most diverse and dynamic but also the least well-understood component of microbial genomes. For all we know, repeat-associated mutations such as duplications, deletions, inversions and gene conversion might be as common as point mutations, but because of short-read myopia and methodological bias, they have received much less attention. Long-read DNA sequencing opens the perspective of resolving repeats and systematically investigating the mutations they induce. For this study, we assembled the genomes of 16 closely related strains of the bacterial pathogen Mycobacterium tuberculosis from Pacific Biosciences HiFi reads, with the aim of characterizing the full spectrum of DNA polymorphisms. We found that complete and accurate genomes can be assembled from HiFi reads, with read size being the main limitation in the presence of duplications. By combining a reference-free pangenome graph with extensive repeat annotation, we identified 110 variants, 58 of which could be assigned to repeat-associated mutational mechanisms such as strand slippage and homologous recombination. Whilst recombination events were less frequent than point mutations, they affected large regions and introduced multiple variants at once, as shown by three gene conversion events and a duplication of 7.3 kb that involved ppe18 and ppe57, two genes possibly involved in immune subversion. The vast majority of variants were present in single isolates, such that phylogenetic resolution was only marginally increased when estimating a tree from complete genomes. Our study shows that the contribution of repeat-associated mechanisms of mutation can be similar to that of point mutations at the microevolutionary scale of an outbreak. A large reservoir of unstudied genetic variation in this 'monomorphic' bacterial pathogen awaits investigation.
About the journal:
Microbial Genomics (MGen) is a fully open access, mandatory open data and peer-reviewed journal publishing high-profile original research on archaea, bacteria, microbial eukaryotes and viruses.