Advancing antimalarial drug discovery: ensemble machine learning models for predicting PfPK6 inhibitor activity
Authors: Maryam Gholami, Mohammad Asadollahi-Baboli
Journal: Molecular Diversity (Journal Article; impact factor 3.9; JCR Q2, Chemistry, Applied)
Published: 2025-04-22
DOI: 10.1007/s11030-025-11203-9 (https://doi.org/10.1007/s11030-025-11203-9)
Malaria is a significant global health challenge, causing high morbidity and mortality. The rise of drug resistance highlights the urgent need for new antimalarial agents. This study focuses on predictive modeling of 104 Plasmodium falciparum protein kinase 6 (PfPK6) inhibitors, employing a range of machine learning techniques to develop ensemble regression and classification models. Molecular descriptors were refined using classification and regression trees (CART) to identify the most relevant features. Six machine learning algorithms (Random Forest (RF), Relevance Vector Machine (RVM), Support Vector Machine (SVM), Cubist, Artificial Neural Networks (ANN), and XGBoost) were utilized to construct regression models. The consensus model demonstrated superior predictive performance, achieving $R^2_{\mathrm{Test}} = 0.94$, $SE_{\mathrm{Test}} = 0.20$, $Q^2_{\mathrm{CV}} = 0.90$, and $SE_{\mathrm{CV}} = 0.25$, outperforming individual models. For classification tasks, five algorithms were evaluated and a majority voting approach yielded an accuracy of 91% and a sensitivity of 93%. The robustness of the models was confirmed through applicability domain analysis (96% coverage) and y-randomization tests, ensuring that the predictive outcomes were not due to chance correlations. This study highlights the effectiveness of ensemble machine learning approaches in predictive modeling and provides critical insights for the rational design of novel PfPK6 inhibitors.
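The abstract does not say how the individual models were combined, so the following is only a minimal sketch of the two ensemble ideas it mentions, assuming unweighted averaging for the regression consensus and simple majority voting for classification. The function names and all numeric values are hypothetical illustrations, not data or code from the study.

```python
from collections import Counter
from statistics import mean


def consensus_prediction(model_predictions):
    """Unweighted consensus: average each compound's prediction across models."""
    return [mean(per_compound) for per_compound in zip(*model_predictions)]


def majority_vote(model_labels):
    """Majority vote: most frequent class label per compound across classifiers."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*model_labels)]


def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    centre = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - centre) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Made-up activity values for four compounds (illustrative only).
observed = [6.0, 7.0, 8.0, 9.0]
regressors = [
    [6.2, 7.1, 7.8, 9.3],  # stand-in for one regression model
    [5.8, 6.9, 8.2, 8.7],  # stand-in for a second model
    [6.0, 7.3, 8.0, 9.0],  # stand-in for a third model
]
consensus = consensus_prediction(regressors)
print("consensus:", consensus)
print("R^2 of consensus:", round(r_squared(observed, consensus), 3))

# Made-up active (1) / inactive (0) calls from five classifiers.
classifiers = [
    [1, 0, 1, 1],
    [1, 0, 0, 1],
    [0, 0, 1, 1],
    [1, 1, 1, 0],
    [1, 0, 1, 1],
]
print("majority vote:", majority_vote(classifiers))  # [1, 0, 1, 1]
```

With an odd number of binary classifiers, as in the five-algorithm setup the abstract describes, a majority vote cannot tie. A weighted average or a stacked meta-model would be equally plausible ways to build a consensus; this sketch takes the simplest option.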
About the journal:
Molecular Diversity is a new publication forum for the rapid publication of refereed papers dedicated to describing the development, application and theory of molecular diversity and combinatorial chemistry in basic and applied research and drug discovery. The journal publishes both short and full papers, perspectives, news and reviews dealing with all aspects of the generation of molecular diversity, application of diversity for screening against alternative targets of all types (biological, biophysical, technological), analysis of results obtained and their application in various scientific disciplines/approaches including:
combinatorial chemistry and parallel synthesis;
small molecule libraries;
microwave synthesis;
flow synthesis;
fluorous synthesis;
diversity oriented synthesis (DOS);
nanoreactors;
click chemistry;
multiplex technologies;
fragment- and ligand-based design;
structure/function/SAR;
computational chemistry and molecular design;
chemoinformatics;
screening techniques and screening interfaces;
analytical and purification methods;
robotics, automation and miniaturization;
targeted libraries;
display libraries;
peptides and peptoids;
proteins;
oligonucleotides;
carbohydrates;
natural diversity;
new methods of library formulation and deconvolution;
directed evolution, origin of life and recombination;
search techniques, landscapes, random chemistry and more.