Ruan M. Carvalho, Iago G. L. Rosa, Diego E. B. Gomes, Priscila V. Z. C. Goliatt, Leonardo Goliatt
Journal of Inclusion Phenomena and Macrocyclic Chemistry, published 2021-07-12
DOI: 10.1007/s10847-021-01092-4
https://link.springer.com/article/10.1007/s10847-021-01092-4
Citations: 5
Gaussian processes regression for cyclodextrin host-guest binding prediction
Machine Learning (ML) techniques are becoming an integral part of rational drug design and discovery. Data-driven modeling regularly outperforms physics-based models for predicting molecular binding affinities, making ML a promising tool. Cyclodextrins are nano-cages used to improve the delivery of insoluble or toxic drugs. Because of their chemical similarity to proteins, cyclodextrins could benefit greatly from ML approaches that improve affinity prediction and expand the portfolio of drugs they can carry. Here we evaluate the performance of three well-known ML methods, Support Vector Regression (SVR), Gaussian Process Regression (GPR), and eXtreme Gradient Boosting (XGB), in predicting the binding affinity between cyclodextrins and known ligands. We tune hyperparameters through Random Search. The results are consistent with those reported in the literature. We improve on our previous prediction performance and present a GPR model that fits the data well (\(R^2\) = 0.803) with low prediction errors (RMSE = 1.811 kJ/mol and MAE = 1.201 kJ/mol).
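The workflow named in the abstract (a GPR model tuned by random search and scored with \(R^2\), RMSE and MAE) can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes scikit-learn and SciPy, uses synthetic data in place of the cyclodextrin-ligand descriptors, and the kernel, search ranges and helper names (`make_synthetic_data`, `tune_and_score_gpr`) are hypothetical choices made only for illustration.

```python
import numpy as np
from scipy.stats import loguniform
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def make_synthetic_data(n_samples=200, n_features=5, seed=0):
    """Synthetic stand-in for descriptors (X) and binding affinities (y).

    Assumption: the real study uses molecular descriptors; these values
    are random and carry no chemical meaning.
    """
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_samples, n_features))
    noise = rng.normal(scale=0.5, size=n_samples)
    y = -15.0 + 3.0 * np.sin(X[:, 0]) + 2.0 * X[:, 1] + noise
    return X, y


def tune_and_score_gpr(X, y, n_iter=10, seed=0):
    """Tune a GPR by random search, then report R^2, RMSE and MAE on a held-out split."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed
    )
    # Assumed kernel: smooth RBF term plus a white-noise term.
    gpr = GaussianProcessRegressor(
        kernel=RBF() + WhiteKernel(), normalize_y=True, random_state=seed
    )
    model = make_pipeline(StandardScaler(), gpr)
    # Hypothetical search space, sampled at random rather than on a grid.
    param_distributions = {
        "gaussianprocessregressor__alpha": loguniform(1e-10, 1e-1),
        "gaussianprocessregressor__n_restarts_optimizer": [0, 1, 2],
    }
    search = RandomizedSearchCV(
        model,
        param_distributions,
        n_iter=n_iter,
        cv=3,
        scoring="neg_root_mean_squared_error",
        random_state=seed,
    )
    search.fit(X_train, y_train)
    y_pred = search.predict(X_test)
    return {
        "r2": float(r2_score(y_test, y_pred)),
        "rmse": float(np.sqrt(mean_squared_error(y_test, y_pred))),
        "mae": float(mean_absolute_error(y_test, y_pred)),
    }


if __name__ == "__main__":
    X, y = make_synthetic_data()
    print(tune_and_score_gpr(X, y))
```

The scores printed here describe only the synthetic data and are not comparable to the values reported in the abstract. The same search loop would apply to SVR or XGB by swapping the estimator and its parameter distributions.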
About the journal:
The Journal of Inclusion Phenomena and Macrocyclic Chemistry is the premier interdisciplinary publication reporting on original research into all aspects of host-guest systems. Examples of specific areas of interest are: the preparation and characterization of new hosts and new host-guest systems, especially those involving macrocyclic ligands; crystallographic, spectroscopic, thermodynamic and theoretical studies; applications in chromatography and inclusion polymerization; enzyme modelling; molecular recognition and catalysis by inclusion compounds; intercalates in biological and non-biological systems, cyclodextrin complexes and their applications in the agriculture, flavoring, food and pharmaceutical industries; synthesis, characterization and applications of zeolites.
The journal publishes primarily reports of original research and preliminary communications, provided the latter represent a significant advance in the understanding of inclusion science. Critical reviews dealing with recent advances in the field are a periodic feature of the journal.