Machine learning based modeling for estimation of drug solubility in supercritical fluid by adjusting important parameters
Yaoyang Liu, Morug Salih Mahdi, Usama Kadem Radi, Ali Jihad, Ali Hamid AbdulHussein, Irshad Ahmad, Nasrin Mansuri, Mostafa Adnan Abdalrahman, Ahmed Alkhayyat, Ahmed Faisal
Chemometrics and Intelligent Laboratory Systems, volume 254, Article 105241 (journal article; impact factor 3.7; JCR Q2, Automation & Control Systems)
Published: 2024-10-03
DOI: 10.1016/j.chemolab.2024.105241
https://www.sciencedirect.com/science/article/pii/S0169743924001813
Abstract
Here, we employ machine learning models to predict the solubility of the drug capecitabine in supercritical carbon dioxide, a green solvent. The aim is to assess the drug's suitability for processing into nanodrugs with enhanced bioavailability. In the data set, pressure (P) and temperature (T) serve as inputs, and solubility (Y) is the only output. Decision tree (DT) and multilayer perceptron (MLP) are the core models. Used alone and in raw form, however, conventional algorithms may not give accurate, generalizable results. Ensemble methods such as boosting can improve model performance. Both the single models and the ensembles built on them have hyperparameters whose optimization affects the final models, and meta-heuristic algorithms are a popular way to tune them. In this research, we use a new hybrid framework that couples the base models with the AdaBoost algorithm (as the ensemble method) and with the PO and CS algorithms (as optimizers), yielding four different models. The MLP model boosted with AdaBoost and tuned with the PO algorithm showed the best fitting accuracy of the four, with an RMSE of 1.71, an MSE of 2.92, and an MAE of 1.42.
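The three error metrics quoted in the abstract are related: MSE is the square of RMSE, and MAE can never exceed RMSE. The sketch below is a minimal illustration of how these metrics are conventionally computed. It is not the authors' code. The helper `regression_errors` and the sample values are hypothetical and are not taken from the paper's data set. The last lines check only that the reported figures are mutually consistent under the standard definitions.

```python
import math


def regression_errors(y_true, y_pred):
    """Return (mse, rmse, mae) for two equal-length sequences of numbers."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(r * r for r in residuals) / len(residuals)
    mae = sum(abs(r) for r in residuals) / len(residuals)
    return mse, math.sqrt(mse), mae


# Hypothetical measured and predicted solubilities (illustrative values only,
# not data from the paper).
measured = [10.0, 12.0, 15.0, 20.0]
predicted = [11.0, 10.0, 16.0, 22.0]

mse, rmse, mae = regression_errors(measured, predicted)
print(f"MSE={mse:.2f} RMSE={rmse:.2f} MAE={mae:.2f}")  # MSE=2.50 RMSE=1.58 MAE=1.50

# Consistency check on the values reported in the abstract:
# RMSE squared should reproduce the MSE after rounding, and MAE <= RMSE.
reported_rmse, reported_mse, reported_mae = 1.71, 2.92, 1.42
print(round(reported_rmse ** 2, 2) == reported_mse)  # True
print(reported_mae <= reported_rmse)                 # True
```

Under these definitions the reported RMSE of 1.71 and MSE of 2.92 agree to rounding (1.71 squared is about 2.924).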
About the journal:
Chemometrics and Intelligent Laboratory Systems publishes original research papers, short communications, reviews, tutorials and Original Software Publications reporting on development of novel statistical, mathematical, or computer techniques in Chemistry and related disciplines.
Chemometrics is the chemical discipline that uses mathematical and statistical methods to design or select optimal procedures and experiments, and to provide maximum chemical information by analysing chemical data.
The journal deals with the following topics:
1) Development of new statistical, mathematical and chemometrical methods for Chemistry and related fields (Environmental Chemistry, Biochemistry, Toxicology, System Biology, -Omics, etc.)
2) Novel applications of chemometrics to all branches of Chemistry and related fields (typical domains of interest are: process data analysis, experimental design, data mining, signal processing, supervised modelling, decision making, robust statistics, mixture analysis, multivariate calibration etc.) Routine applications of established chemometrical techniques will not be considered.
3) Development of new software that provides novel tools or truly advances the use of chemometrical methods.
4) Well-characterized data sets for testing the performance of new methods and software.
The journal complies with the International Committee of Medical Journal Editors' Uniform Requirements for Manuscripts.