Gaetan Senelle, Muhammed Rabiu Sahal, Kevin La, Typhaine Billard-Pomares, Julie Marin, Faiza Mougari, Antoine Bridier-Nahmias, Etienne Carbonnelle, Emmanuelle Cambau, Guislaine Refrégier, Christophe Guyeux, Christophe Sola
Tuberculosis, volume 143, Article 102376. Published 2023-11-25. DOI: 10.1016/j.tube.2023.102376. https://www.sciencedirect.com/science/article/pii/S1472979223000744
Towards the reconstruction of a global TB history using a new pipeline "TB-Annotator"
The Mycobacterium tuberculosis complex (MTBC) has a population structure consisting of 9 human and animal lineages. The genomic diversity within these lineages is a pathogenesis factor that affects virulence, transmissibility, host response, and antibiotic resistance. Hence it is important to develop improved information systems for tracking and understanding the spread and evolution of genomes. We present results obtained with TB-Annotator, a new informatics platform for the computational biology of MTBC that uses a convenience sample from public and private SRAs. Version 1 was a first interactive, graphics-based web tool built on 15,901 representative genomes. Version 2, still interactive, is a more sophisticated database developed with the Snakemake Workflow Management System (WMS); it allows an unsupervised, global, and scalable analysis of the content of the Sequence Read Archive (SRA) database of the USA National Center for Biotechnology Information. The platform analyzes nucleotide variants, the presence/absence of genes, known regions of difference, and the insertion sites of mobile genetic elements; it detects new deletions; and it allows phylogenetic trees to be built, imported into a graphical interface, and explored interactively alongside the underlying data. The objective of TB-Annotator is threefold: to detect recent epidemiological links, to reconstruct distant phylogeographical histories, and to perform more complex phenotypic/genotypic Genome-Wide Association Studies (GWAS). In this paper, we compare the various taxonomic SNP-based labels and hierarchies described in recent reference papers for L1, and present a comparative analysis that identifies aliases and thus provides the basis for a future unifying naming scheme for L1 sublineages. We present a global phylogenetic tree built with RAxML-NG, and one for L2; at the time of writing, we have characterized about 200 sublineages, many of them new. A detailed tree for Modern L2 and a hierarchical scheme to facilitate L2 lineage assignment are also presented.
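The idea of a hierarchical, SNP-based scheme for lineage assignment can be illustrated with a minimal Python sketch. It is not the TB-Annotator implementation: the `Node` class, the `assign` function, the sublineage labels (`L2.A`, `L2.B`, ...) and all genome positions are hypothetical placeholders chosen for illustration. The sketch assumes each sublineage is defined by a set of marker SNPs and that a sample belongs to the deepest node whose markers it carries.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One level in a hypothetical sublineage hierarchy."""
    label: str
    # (position, alternate_base) pairs that define this node; placeholders only.
    markers: frozenset = frozenset()
    children: list = field(default_factory=list)


def assign(node, sample_snps):
    """Return the deepest label whose marker SNPs are all present in the sample."""
    for child in node.children:
        # Descend into the first child whose full marker set is carried by the sample.
        if child.markers <= sample_snps:
            return assign(child, sample_snps)
    return node.label


# Made-up hierarchy and positions, for illustration only.
scheme = Node("L2", children=[
    Node("L2.A", frozenset({(1000, "A")}), [
        Node("L2.A.1", frozenset({(2000, "T")})),
    ]),
    Node("L2.B", frozenset({(3000, "G")}), [
        Node("L2.B.1", frozenset({(4000, "C"), (4500, "T")})),
    ]),
])

sample = {(3000, "G"), (4000, "C"), (4500, "T"), (9999, "A")}
print(assign(scheme, sample))  # L2.B.1
```

Walking the tree from the root keeps each decision local: a sample is only tested against the markers of sublineages nested inside the node it has already matched, which is what makes a hierarchical scheme cheaper and easier to audit than testing every label independently.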
About the journal:
Tuberculosis is a speciality journal focusing on basic experimental research on tuberculosis, notably on the bacteriological, immunological and pathogenesis aspects of the disease. The journal publishes original research and reviews on the host response and immunology of tuberculosis and on the molecular biology, genetics and physiology of the organism, but discourages submissions with a meta-analytical focus (for example, articles based on searches of published articles in public electronic databases, especially where there is a lack of evidence of the personal involvement of authors in the generation of such material). We do not publish clinical case studies.
Areas on which submissions are welcomed include:
- Clinical Trials
- Diagnostics
- Antimicrobial resistance
- Immunology
- Leprosy
- Microbiology, including microbial physiology
- Molecular epidemiology
- Non-tuberculous Mycobacteria
- Pathogenesis
- Pathology
- Vaccine development
This journal does not accept case reports.
The resurgence of interest in tuberculosis has accelerated the pace of relevant research and Tuberculosis has grown with it, as the only journal dedicated to experimental biomedical research in tuberculosis.