High-fidelity prediction of drug solubility in supercritical CO₂ for pharmaceutical applications using advanced computational modeling

IF 4.7 · Zone 3 (Medicine) · Q1, Pharmacology & Pharmacy
Hashem O. Alsaab, Saeed Shirazian
European Journal of Pharmaceutical Sciences, vol. 215, Article 107321. Journal article, published 2025-10-10.
DOI: 10.1016/j.ejps.2025.107321
https://www.sciencedirect.com/science/article/pii/S0928098725003197

Abstract

Accurately estimating the solubility of drugs in supercritical carbon dioxide (SC-CO₂) remains a major challenge in drug formulation, separation processes, and green technologies. Traditional empirical and semi-empirical methods often struggle to represent the complex non-linear interactions that determine solubility under varying thermodynamic conditions (e.g., temperature T and pressure P), which restricts their applicability and predictive consistency. This study presents an ensemble framework for estimating pharmaceutical solubility in SC-CO₂. It combines three machine learning regressors, namely Extreme Gradient Boosting Regression (XGBR), Light Gradient Boosting Regression (LGBR), and CatBoost Regression (CATr), optimized by two bio-inspired algorithms, the Artificial Protozoa Optimizer (APO) and the Hippopotamus Optimization Algorithm (HOA). A dataset of 110 experimental samples covering temperature, pressure, molecular weight (MW), and melting point (MP) for four drugs (Rifampin, Sirolimus, Tacrolimus, and Teriflunomide) was used to model solubility. Model robustness was evaluated through k-fold cross-validation, and interpretability was assessed via SHAP and FAST sensitivity analysis. Prediction intervals were generated by bootstrapping to improve reliability for real-world applications. The ensemble (XGBR + LGBR + CATr optimized by HOA) achieved high predictive accuracy (R² = 0.9920, RMSE = 0.08878). The results highlight the potential of optimized ensemble learning for capturing non-linear solubility behavior, offering a reliable computational framework for pharmaceutical engineering and green drug processing.
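The evaluation workflow named in the abstract (an averaged ensemble, k-fold cross-validation scored by R² and RMSE, and bootstrap intervals) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic, not the paper's 110 samples; three k-nearest-neighbour regressors stand in for XGBR, LGBR, and CATr; plain averaging stands in for whatever combination rule the authors used; and the APO/HOA hyperparameter search is omitted. The interval is a percentile interval over bootstrap refits, which may differ from the authors' procedure.

```python
import math
import random
import statistics


def rmse(y_true, y_pred):
    """Root-mean-square error."""
    return math.sqrt(statistics.fmean([(t - p) ** 2 for t, p in zip(y_true, y_pred)]))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = statistics.fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_knn(k):
    """Return a fit function for a k-NN regressor (hypothetical stand-in learner)."""
    def fit(X, y):
        def predict(x):
            nearest = sorted(range(len(X)), key=lambda i: math.dist(X[i], x))[:k]
            return statistics.fmean([y[i] for i in nearest])
        return predict
    return fit


def fit_ensemble(X, y, learners):
    """Fit every base learner and average their predictions."""
    models = [fit(X, y) for fit in learners]
    return lambda x: statistics.fmean([m(x) for m in models])


def kfold(n, k, seed=0):
    """Yield (train, test) index lists for shuffled k-fold cross-validation."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test


def bootstrap_interval(fit, X, y, x_new, n_boot=200, alpha=0.05, seed=0):
    """Percentile interval of predictions from models refit on resampled rows."""
    rng = random.Random(seed)
    preds = []
    for _ in range(n_boot):
        rows = [rng.randrange(len(X)) for _ in X]  # sample rows with replacement
        preds.append(fit([X[i] for i in rows], [y[i] for i in rows])(x_new))
    preds.sort()
    return preds[int(alpha / 2 * n_boot)], preds[int((1 - alpha / 2) * n_boot) - 1]


# Synthetic (temperature in K, pressure in MPa) -> made-up response; NOT real solubility data.
rng = random.Random(42)
X = [(rng.uniform(308.0, 338.0), rng.uniform(12.0, 30.0)) for _ in range(60)]
y = [0.02 * (T - 308.0) + 0.1 * P + rng.gauss(0.0, 0.05) for T, P in X]

learners = [fit_knn(1), fit_knn(3), fit_knn(5)]
fit = lambda Xs, ys: fit_ensemble(Xs, ys, learners)

# Pool out-of-fold predictions, then score them once.
y_true, y_pred = [], []
for train, test in kfold(len(X), k=5):
    model = fit([X[i] for i in train], [y[i] for i in train])
    y_true += [y[i] for i in test]
    y_pred += [model(X[i]) for i in test]

cv_r2, cv_rmse = r2(y_true, y_pred), rmse(y_true, y_pred)
lo, hi = bootstrap_interval(fit, X, y, x_new=(320.0, 20.0))
print(f"CV R2={cv_r2:.3f} RMSE={cv_rmse:.3f} interval=({lo:.3f}, {hi:.3f})")
```

Scoring pooled out-of-fold predictions keeps every sample out of the model that predicts it, which is the property k-fold validation is meant to provide. With only four drugs in the real dataset, a grouped split by drug would be a stricter test of generalization; the abstract does not say which split was used.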

Journal metrics: CiteScore 9.60 · self-citation rate 2.20% · articles published 248 · review time 50 days
About the journal: The journal publishes research articles, review articles and scientific commentaries on all aspects of the pharmaceutical sciences with emphasis on conceptual novelty and scientific quality. The Editors welcome articles in this multidisciplinary field, with a focus on topics relevant for drug discovery and development. More specifically, the Journal publishes reports on medicinal chemistry, pharmacology, drug absorption and metabolism, pharmacokinetics and pharmacodynamics, pharmaceutical and biomedical analysis, drug delivery (including gene delivery), drug targeting, pharmaceutical technology, pharmaceutical biotechnology and clinical drug evaluation. The journal will typically not give priority to manuscripts focusing primarily on organic synthesis, natural products, adaptation of analytical approaches, or discussions pertaining to drug policy making. Scientific commentaries and review articles are generally by invitation only or by consent of the Editors. Proceedings of scientific meetings may be published as special issues or supplements to the Journal.