Title: Prediction of Mycobacterium tuberculosis cell wall permeability using machine learning methods
Authors: Aritra Banerjee, Anju Sharma, Pradnya Kamble, Prabha Garg
Journal: Molecular Diversity (impact factor 3.9; JCR Q2, Chemistry, Applied)
Publication date: 2024-08-12
Publication type: Journal Article
DOI: 10.1007/s11030-024-10952-3
URL: https://link.springer.com/article/10.1007/s11030-024-10952-3
Citation count: 0
Abstract
Tuberculosis (TB), caused by the bacterium Mycobacterium tuberculosis (M. tb), continues to pose a significant worldwide health threat. The emergence of drug-resistant strains highlights the critical need for novel treatments. The unique cell wall of M. tb provides an extra layer of protection for the bacterium, so only compounds that can penetrate this barrier can reach their targets within the bacterial cell. This work presents the development of a reliable machine learning (ML) model to predict the mycobacterial cell wall permeability of small molecules. Four ML algorithms, Random Forest, Support Vector Machine (SVM), k-Nearest Neighbours (k-NN) and Logistic Regression, were trained on a dataset of 5368 compounds, with features calculated using the RDKit and Mordred toolkits. To determine the most effective model, the models were compared on accuracy, precision, recall, F1 score, and area under the receiver operating characteristic curve. The best-performing model was further refined with hyperparameter tuning and tenfold cross-validation. The SVM model with filtering outperformed the other models, achieving 80.26% and 81.13% accuracy on the test and validation datasets, respectively. The study also provides insights into the molecular descriptors that contribute most to predicting whether a molecule can cross the M. tb cell wall, which could guide future compound design. The model is available at https://github.com/PGlab-NIPER/MTB_Permeability.
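The evaluation metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names are hypothetical, the labels and scores are invented toy values (1 = permeable, 0 = impermeable) rather than data from the study, and the 0.5 decision threshold is an assumption made only for this example.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for binary labels (1 = positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive is scored above a randomly chosen negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (invented values, not from the paper's dataset).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]   # classifier scores
y_pred = [1 if s >= 0.5 else 0 for s in scores]      # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

Accuracy, precision, recall and F1 depend on the chosen threshold, whereas the ROC AUC is computed from the raw scores and summarises ranking quality across all thresholds, which is one reason to report both kinds of metric when comparing classifiers.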
About the journal:
Molecular Diversity is a new publication forum for the rapid publication of refereed papers dedicated to describing the development, application and theory of molecular diversity and combinatorial chemistry in basic and applied research and drug discovery. The journal publishes both short and full papers, perspectives, news and reviews dealing with all aspects of the generation of molecular diversity, application of diversity for screening against alternative targets of all types (biological, biophysical, technological), analysis of results obtained and their application in various scientific disciplines/approaches including:
combinatorial chemistry and parallel synthesis;
small molecule libraries;
microwave synthesis;
flow synthesis;
fluorous synthesis;
diversity oriented synthesis (DOS);
nanoreactors;
click chemistry;
multiplex technologies;
fragment- and ligand-based design;
structure/function/SAR;
computational chemistry and molecular design;
chemoinformatics;
screening techniques and screening interfaces;
analytical and purification methods;
robotics, automation and miniaturization;
targeted libraries;
display libraries;
peptides and peptoids;
proteins;
oligonucleotides;
carbohydrates;
natural diversity;
new methods of library formulation and deconvolution;
directed evolution, origin of life and recombination;
search techniques, landscapes, random chemistry and more.