Joshua J Carter, Timothy M Walker, A Sarah Walker, Michael G Whitfield, Glenn P Morlock, Charlotte I Lynch, Dylan Adlard, Timothy E A Peto, James E Posey, Derrick W Crook, Philip W Fowler
Prediction of pyrazinamide resistance in Mycobacterium tuberculosis using structure-based machine-learning approaches.
JAC-Antimicrobial Resistance, volume 6, issue 2, dlae037. Published 2024-03-18. DOI: https://doi.org/10.1093/jacamr/dlae037
Abstract
Background: Pyrazinamide is one of four first-line antibiotics used to treat tuberculosis; however, antibiotic susceptibility testing for pyrazinamide is challenging. Resistance to pyrazinamide is primarily driven by genetic variation in pncA, encoding an enzyme that converts pyrazinamide into its active form.
Methods: We curated a dataset of 664 non-redundant, missense amino acid mutations in PncA with associated high-confidence phenotypes from published studies and then trained three different machine-learning models to predict pyrazinamide resistance. All models had access to a range of protein structural-, chemical- and sequence-based features.
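As an illustration of the kind of data handling the Methods imply, the following is a minimal sketch of removing redundant records and setting aside a stratified hold-out test set. It is not the authors' pipeline: the function name `stratified_holdout`, the record layout `(mutation, phenotype)`, the labels "R"/"S", the 30% test fraction and the toy records are all assumptions made for the example.

```python
import random
from collections import defaultdict


def stratified_holdout(records, test_fraction=0.3, seed=0):
    """Deduplicate (mutation, phenotype) records and split them into
    train/test sets while keeping the class balance roughly equal.

    test_fraction and seed are illustrative defaults, not values from the paper.
    """
    rng = random.Random(seed)

    # Drop exact duplicates so each mutation/phenotype pair appears once.
    unique = sorted(set(records))

    # Group records by phenotype label so each class is split separately.
    by_label = defaultdict(list)
    for mutation, phenotype in unique:
        by_label[phenotype].append((mutation, phenotype))

    train, test = [], []
    for label in sorted(by_label):
        group = by_label[label]
        rng.shuffle(group)
        n_test = round(len(group) * test_fraction)
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test


# Hypothetical toy records: 10 resistant ("R") and 10 susceptible ("S"),
# plus one repeated record that should be removed by deduplication.
records = [(f"mut{i}", "R" if i % 2 == 0 else "S") for i in range(20)]
records.append(("mut0", "R"))

train, test = stratified_holdout(records, test_fraction=0.3, seed=0)
print(len(train), len(test))
```

Splitting within each phenotype class keeps the resistant/susceptible ratio of the test set close to that of the full dataset, which matters when sensitivity and specificity are both reported.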
Results: The best model, a gradient-boosted decision tree, achieved a sensitivity of 80.2% and a specificity of 76.9% on the hold-out test dataset. The clinical performance of the models was then estimated by predicting the binary pyrazinamide resistance phenotype of 4027 samples harbouring 367 unique missense mutations in pncA derived from 24 231 clinical isolates.
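For readers less familiar with the two metrics reported above, here is a minimal sketch of how sensitivity and specificity are computed from binary predictions, treating resistance as the positive class. The function name, the "R"/"S" labels and the toy predictions are assumptions for illustration and are not data from the study.

```python
def sensitivity_specificity(y_true, y_pred, positive="R"):
    """Return (sensitivity, specificity) for binary labels.

    Sensitivity = TP / (TP + FN): fraction of resistant samples predicted resistant.
    Specificity = TN / (TN + FP): fraction of susceptible samples predicted susceptible.
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1

    # Guard against division by zero when a class is absent.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical toy labels: 4 resistant and 4 susceptible samples.
y_true = ["R", "R", "R", "R", "S", "S", "S", "S"]
y_pred = ["R", "R", "R", "S", "S", "S", "S", "R"]

sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In this toy case one resistant sample is missed and one susceptible sample is wrongly called resistant, so both metrics come out at 0.75.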
Conclusions: This work demonstrates how machine learning can enhance the sensitivity/specificity of pyrazinamide resistance prediction in genetics-based clinical microbiology workflows, highlights novel mutations for future biochemical investigation, and provides a proof of concept for applying this approach to other drugs.