Anupama Binoy, Ratul Bhowmik, Preena S. Parvathy, C. Gopi Mohan
Journal of Cellular Biochemistry, Volume 126, Issue 6. Published 2025-06-29. DOI: 10.1002/jcb.70049
https://onlinelibrary.wiley.com/doi/10.1002/jcb.70049
Machine Learning Models and Structure-Based Antibacterial Drug Discovery of the Key ABC Transporter Maltose-Binding Protein A
Abstract
Generating new and efficient drugs through machine learning-assisted quantitative structure–activity relationships (ML-QSAR) has become a promising strategy for addressing multidrug-resistant gram-negative bacterial infections. We developed robust ML-QSAR models using Genetic Function Approximation (GFA), Support Vector Machine (SVM), and Artificial Neural Network (ANN) methods to predict the activity of experimentally known quinoline-based MsbA inhibitors, aiming to create more effective antibacterial drugs. The ML models were built using eight significant molecular descriptors: B09[N-Cl], CATS3D_04_AA, F06[N-O], G2i, molecular weight (MW), Mor04p, VE1sign_B(s), and VE1sign_Dz(i), along with 279 molecular fingerprints to predict the MsbA inhibition activity of quinoline-based compounds. The molecular descriptor-based SVM model achieved an R² of 0.9891 and a q² cross-validation correlation of 0.9388. In contrast, the molecular fingerprint-based SVM model had an R² of 0.9981 and a q² cross-validation correlation of 0.7584, making it the best-performing model. The robustness of these developed models was further validated through various internal, external, and applicability domain analyses. The most active compounds identified in this data set, compounds 31 and 40, were subsequently used to generate 62 new quinoline-based compounds. Additionally, three modelled quinoline-based inhibitors, M28, N7, and N23, demonstrated excellent bioactivity, binding affinity, and pharmacokinetic profiles. To support further research, the fingerprint-based ML-QSAR model is available as a web application, MsbA-Pred (https://msba-mohan-amrita.streamlit.app/), which allows users to predict MsbA inhibitory activity from any device.
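The abstract reports two statistics per model: R², which scores the fit on the compounds used for training, and q², which scores predictions under cross-validation. The sketch below is only an illustration of how the two differ. It assumes leave-one-out cross-validation (the abstract does not say which scheme was used), substitutes a one-descriptor least-squares line for the SVM, and uses invented descriptor and activity values; all function and variable names are hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit y = a + b*x for a single descriptor."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    return a, b


def r_squared(xs, ys):
    """R^2: every compound is used both to fit the model and to score it."""
    a, b = fit_line(xs, ys)
    my = mean(ys)
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


def q_squared_loo(xs, ys):
    """Leave-one-out q^2: each compound is predicted by a model fitted without it.

    Uses q^2 = 1 - PRESS / SS_tot with SS_tot taken around the full-set mean
    (one common convention; others use the mean of each training fold).
    """
    my = mean(ys)
    press = 0.0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        a, b = fit_line(train_x, train_y)
        press += (ys[i] - (a + b * xs[i])) ** 2
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - press / ss_tot


# Invented values for illustration only (not data from the paper).
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
activity = [5.1, 5.9, 7.2, 7.8, 9.1, 9.9]

r2 = r_squared(descriptor, activity)
q2 = q_squared_loo(descriptor, activity)
print(f"R^2 = {r2:.4f}, q^2 (LOO) = {q2:.4f}")
```

For a least-squares model, q² computed this way cannot exceed R², because each held-out compound is predicted without its own contribution to the fit. A wide gap between the two, as in the fingerprint-based model's reported 0.9981 versus 0.7584, is typically read as a sign that the fit is tighter than the model's predictive ability.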
About the journal:
The Journal of Cellular Biochemistry publishes descriptions of original research in which complex cellular, pathogenic, clinical, or animal model systems are studied by biochemical, molecular, genetic, epigenetic, or quantitative ultrastructural approaches. Submission of papers reporting genomic, proteomic, bioinformatics, and systems biology approaches to identify and characterize parameters of biological control in a cellular context is encouraged. The areas covered include, but are not restricted to, conditions, agents, regulatory networks, or differentiation states that influence structure, cell cycle and growth control, and structure-function relationships.