Advantages of updated WHO mutation catalog combined with existing whole-genome sequencing-based approaches for Mycobacterium tuberculosis resistance prediction.
Yiwang Chen, Xuecong Zhang, Jialei Liang, Qi Jiang, Mijiti Peierdun, Peng Xu, Howard E Takiff, Qian Gao
Genome Medicine, 17(1): 31. Published 2025-03-26. DOI: 10.1186/s13073-025-01458-0 (https://doi.org/10.1186/s13073-025-01458-0). Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11938600/pdf/
Abstract
Background: The WHO recently released a second edition of its mutation catalog for predicting drug resistance in Mycobacterium tuberculosis (MTB). This study evaluates the catalog's effectiveness against existing whole-genome sequencing (WGS)-based prediction methods and proposes a new approach for optimizing it.
Methods: We tested the accuracy of five tools (the WHO catalog, TB Profiler, SAM-TB, GenTB, and MD-CNN) for predicting drug susceptibility on a global dataset of 36,385 MTB isolates with high-quality phenotypic drug susceptibility testing (DST) and WGS data. By integrating the genotypic DST predictions of these five tools in an ensemble machine learning framework, we developed an improved computational model for MTB drug susceptibility prediction. We then validated the ensemble model on 860 MTB isolates with phenotypic and WGS data collected in Shenzhen, China (2013-2019) and Valencia, Spain (2014-2016).
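The abstract does not say which ensemble algorithm was used, so the following is only a minimal sketch of one common way to combine per-tool calls: a logistic-regression stacker trained on the five tools' binary predictions for a single drug. The function names, the column order in `TOOLS`, and the toy calls and phenotypes are all hypothetical and are not taken from the study.

```python
import math

# Hypothetical column order: one binary call per tool (1 = resistant, 0 = susceptible).
TOOLS = ["WHO catalog", "TB Profiler", "SAM-TB", "GenTB", "MD-CNN"]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stacker(calls, phenotypes, lr=0.5, epochs=2000):
    """Fit logistic-regression weights over per-tool calls by batch gradient descent.

    This is an assumed stand-in for the paper's unspecified ensemble model.
    """
    n_tools = len(calls[0])
    n = len(calls)
    weights = [0.0] * n_tools
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_tools
        grad_b = 0.0
        for row, label in zip(calls, phenotypes):
            z = bias + sum(w * x for w, x in zip(weights, row))
            err = sigmoid(z) - label  # gradient of log-loss w.r.t. z
            for j, x in enumerate(row):
                grad_w[j] += err * x
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict_proba(model, row):
    """Return the stacker's estimated probability of phenotypic resistance."""
    weights, bias = model
    return sigmoid(bias + sum(w * x for w, x in zip(weights, row)))


# Made-up toy data for illustration only: rows are isolates, columns follow TOOLS.
toy_calls = [
    [1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1],
    [0, 0, 0, 0, 1],
    [1, 1, 0, 1, 1],
    [0, 1, 0, 0, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 0, 1, 0],
]
toy_phenotypes = [1, 0, 1, 0, 1, 0, 1, 0]  # 1 = phenotypically resistant

model = fit_stacker(toy_calls, toy_phenotypes)
for name, weight in zip(TOOLS, model[0]):
    print(f"{name}: weight {weight:.2f}")
print("P(resistant | all tools call resistant):", round(predict_proba(model, [1, 1, 1, 1, 1]), 3))
```

In a stacker of this kind, the learned weights let a tool with high specificity and a tool with high sensitivity contribute differently to the final call, which is one plausible reason a combined model can outperform each input on its own.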
Results: Among the five genotypic DST tools for predicting susceptibility to ten drugs, MD-CNN exhibited the highest overall performance (AUC 92.1%; 95% CI 89.8-94.4%). The WHO catalog had the highest specificity, 97.3% (95% CI 95.8-98.4%), while TB Profiler had the highest sensitivity, 79.5% (95% CI 71.8-86.2%). The ensemble machine learning model (AUC 93.4%; 95% CI 91.4-95.4%) outperformed all five individual tools, with a specificity of 95.4% (95% CI 93.0-97.6%) and a sensitivity of 84.1% (95% CI 78.8-88.8%), principally due to considerable improvements in second-line drug resistance predictions (AUC 91.8%; 95% CI 89.6-94.0%).
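For readers who want to see how figures of this kind are computed, here is a minimal sketch of sensitivity, specificity, a 95% confidence interval, and AUC, calculated from phenotypic labels and genotypic predictions. The abstract does not state how the intervals were derived; the Wilson score interval used here is an assumption, and the labels and scores are made up for illustration.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a proportion (assumed CI method, not the paper's)."""
    if total == 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half, centre + half


def confusion_counts(y_true, y_pred):
    """Count TP, FN, TN, FP with 1 = resistant and 0 = susceptible."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp, fn, tn, fp


def auc(scores, labels):
    """AUC as the fraction of (resistant, susceptible) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one isolate of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up example: 4 resistant and 6 susceptible isolates.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

tp, fn, tn, fp = confusion_counts(y_true, y_pred)
sens = tp / (tp + fn)  # resistant isolates correctly called resistant
spec = tn / (tn + fp)  # susceptible isolates correctly called susceptible
sens_ci = wilson_interval(tp, tp + fn)
spec_ci = wilson_interval(tn, tn + fp)

auc_value = auc([0.9, 0.6, 0.7, 0.1], [1, 1, 0, 0])

print(f"sensitivity {sens:.3f}, 95% CI {sens_ci[0]:.3f}-{sens_ci[1]:.3f}")
print(f"specificity {spec:.3f}, 95% CI {spec_ci[0]:.3f}-{spec_ci[1]:.3f}")
print(f"AUC {auc_value:.2f}")
```

Sensitivity and specificity depend on a fixed resistant/susceptible call, whereas AUC summarizes ranking quality across all thresholds, which is why a tool can lead on one metric without leading on the others.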
Conclusions: The second edition of the WHO MTB mutation catalog does not, by itself, perform better than existing tools for predicting MTB drug resistance. An integrative approach combining the WHO catalog with other genotypic DST methods significantly enhances prediction accuracy.
About the journal:
Genome Medicine is an open access journal that publishes outstanding research applying genetics, genomics, and multi-omics to understand, diagnose, and treat disease. Bridging basic science and clinical research, it covers areas such as cancer genomics, immuno-oncology, immunogenomics, infectious disease, microbiome, neurogenomics, systems medicine, clinical genomics, gene therapies, precision medicine, and clinical trials. The journal publishes original research, methods, software, and reviews to serve authors and promote broad interest and importance in the field.