An interpretable surrogate model for H2S solubility forecasting in ionic liquids based on machine learning

IF 8.1 · Zone 1 (Engineering Technology) · Q1 (ENGINEERING, CHEMICAL)
Yanjiang He, Ao Yang, Changjun Zou, Tianyou Fan, Qikui Lan, Yu He, Meng Wang, Jaka Sunarso, Zong Yang Kong
Journal: Separation and Purification Technology
Publication date: 2024-10-11
DOI: 10.1016/j.seppur.2024.130061 (https://doi.org/10.1016/j.seppur.2024.130061)
Citation count: 0

Abstract

We investigated four machine learning (ML) models, namely Gaussian process regression (GPR), extreme gradient boosting (XGBoost), random forest (RF), and support vector machine (SVM), for predicting the solubility of H₂S in various ionic liquids (ILs). The dataset was divided into training and testing sets in an 80:20 ratio, and the performance of all models was evaluated using the coefficient of determination (R²), mean absolute error (MAE), and root mean square error (RMSE). All four models predicted H₂S solubility effectively, albeit with varying degrees of performance. GPR performed best, with an R² of 0.9918, MAE of 0.0090, and RMSE of 0.0147, followed by XGBoost with an R² of 0.9827, MAE of 0.0155, and RMSE of 0.0213. The RF model performed somewhat lower, with an R² of 0.9395, MAE of 0.0261, and RMSE of 0.0398, while the SVM model performed worst, with an R² of 0.9036, MAE of 0.0402, and RMSE of 0.0508. SHAP analysis identified pressure, temperature, Estate_VSA3, Estate_VSA5, and MinEStateIndex as the five most influential input features. Overall, this work offers new insights into the molecular characteristics that affect the solubility of H₂S in ILs, paving the way for future research in this field.
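The evaluation protocol described in the abstract (an 80:20 split scored with R², MAE, and RMSE) can be illustrated with a minimal sketch. This is not the authors' code: the helper names, the random seed, and all data values below are hypothetical placeholders, and only the Python standard library is used so the metric definitions stay visible.

```python
import math
import random


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle rows and split them into (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical placeholder rows (index, value); not the paper's dataset.
rows = [(i, 0.01 * i) for i in range(100)]
train, test = train_test_split(rows)  # 80 training rows, 20 test rows

# Hypothetical measured vs. predicted solubilities, for illustration only.
y_true = [0.10, 0.20, 0.30, 0.40]
y_pred = [0.12, 0.18, 0.33, 0.39]

print(f"R2={r2(y_true, y_pred):.4f}  "
      f"MAE={mae(y_true, y_pred):.4f}  "
      f"RMSE={rmse(y_true, y_pred):.4f}")
```

A higher R² together with lower MAE and RMSE on the held-out 20% indicates a better fit, which is the basis for the ranking GPR, XGBoost, RF, SVM reported in the abstract.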
Source journal
Separation and Purification Technology
Category: Engineering Technology, Chemical Engineering
CiteScore: 14.00
Self-citation rate: 12.80%
Articles published: 2347
Review time: 43 days
About the journal: Separation and Purification Technology is a premier journal committed to sharing innovative methods for separation and purification in chemical and environmental engineering, encompassing both homogeneous solutions and heterogeneous mixtures. Its scope includes the separation and/or purification of liquids, vapors, and gases, as well as carbon capture and separation techniques. Methods intended solely for analytical purposes are outside the scope of the journal, as are disciplines such as soil science, polymer science, and metallurgy.