Predicting CO2 Solubility in Diverse Ionic Liquids: A Data-Driven Approach Using Machine Learning Algorithms

IF 5.2 | Zone 3, Engineering Technology | Q2, ENERGY & FUELS
Zahra Bastami, Mohammad Amin Sobati*, and Mahdieh Amereh
Energy & Fuels, vol. 39, no. 23, pp. 11256–11278. Published 2025-06-03. DOI: 10.1021/acs.energyfuels.5c01345
https://pubs.acs.org/doi/10.1021/acs.energyfuels.5c01345
Citations: 0

Abstract


In this study, new machine-learning-based models were developed to predict carbon dioxide (CO2) solubility in different ionic liquids (ILs). An extensive data set of 16,480 experimental CO2 solubility data points in 296 ILs, consisting of 103 different cation and 78 different anion structures, was used for this purpose. Quantitative Structure–Property Relationship (QSPR) models were developed on this data set using linear and nonlinear methods. To account for the effect of cation and anion structures on CO2 solubility, basic descriptors were calculated, including zero-dimensional, one-dimensional, and fingerprint descriptors (a category of two-dimensional descriptors). The most relevant variables were then identified through stepwise regression (SWR), which selected 18 categories of cationic and anionic descriptors; these, together with temperature and pressure, served as inputs to nonlinear machine learning (ML) models such as MultiLayer Perceptron (MLP), Radial Basis Function (RBF), Random Forest (RF), and Least-Squares Boosting (LSBoost). Internal and external validation indicated that the LSBoost model predicted CO2 solubility most accurately and was the most capable of modeling complex data. Its R² and MSE values were 0.9962 and 0.0070 for the training set and 0.9243 and 0.1277 for the test set, respectively. Comparisons with models available in the literature showed that the LSBoost model outperforms them, indicating that it is reliable for predicting CO2 solubility in new ILs and can aid the design and selection of ILs for CO2 capture.
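
As an illustration of the least-squares boosting idea named in the abstract, here is a minimal sketch in plain Python. It is not the authors' model or data. The three input columns (temperature, pressure, one binary descriptor bit), the toy response function, the decision-stump weak learner, and all function names and hyperparameters are assumptions made for this example. Each round fits a stump to the current residuals and adds a shrunken copy of it to the ensemble; R² and MSE are then computed on a training and a held-out split.

```python
import random


def make_synthetic(n, seed=0):
    """Toy data (assumed): temperature (K), pressure (MPa), one binary descriptor bit."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        temp = rng.uniform(293.0, 353.0)
        pres = rng.uniform(0.1, 10.0)
        bit = float(rng.random() < 0.5)
        # Invented response: rises with pressure, falls with temperature.
        target = 0.2 + 0.6 * pres / (1.0 + pres) - 0.003 * (temp - 293.0) + 0.1 * bit
        X.append([temp, pres, bit])
        y.append(target + rng.gauss(0.0, 0.01))
    return X, y


def fit_stump(X, residuals, n_cuts=8):
    """Return (feature, threshold, left_mean, right_mean) minimising squared error."""
    best = None
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        lo, hi = min(column), max(column)
        if lo == hi:
            continue  # constant feature, nothing to split on
        for k in range(1, n_cuts + 1):
            thr = lo + (hi - lo) * k / (n_cuts + 1)
            n_l = n_r = 0
            s_l = s_r = 0.0
            for value, r in zip(column, residuals):
                if value <= thr:
                    n_l += 1
                    s_l += r
                else:
                    n_r += 1
                    s_r += r
            if n_l == 0 or n_r == 0:
                continue
            # Maximising this gain is equivalent to minimising the split's SSE.
            gain = s_l * s_l / n_l + s_r * s_r / n_r
            if best is None or gain > best[0]:
                best = (gain, j, thr, s_l / n_l, s_r / n_r)
    if best is None:
        raise ValueError("no valid split found")
    return best[1:]


def ls_boost(X, y, n_rounds=200, learning_rate=0.3):
    """Least-squares boosting: each stump is fitted to the current residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        j, thr, left, right = fit_stump(X, residuals)
        stumps.append((j, thr, left, right))
        pred = [p + learning_rate * (left if row[j] <= thr else right)
                for row, p in zip(X, pred)]
    return base, learning_rate, stumps


def predict(model, X):
    base, learning_rate, stumps = model
    return [base + learning_rate * sum(left if row[j] <= thr else right
                                       for j, thr, left, right in stumps)
            for row in X]


def r2_and_mse(y_true, y_pred):
    n = len(y_true)
    mean = sum(y_true) / n
    sse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    sst = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - sse / sst, sse / n


def run_demo():
    X, y = make_synthetic(320, seed=0)
    X_train, y_train, X_test, y_test = X[:240], y[:240], X[240:], y[240:]
    model = ls_boost(X_train, y_train)
    train_r2, train_mse = r2_and_mse(y_train, predict(model, X_train))
    test_r2, test_mse = r2_and_mse(y_test, predict(model, X_test))
    return {"train_r2": train_r2, "train_mse": train_mse,
            "test_r2": test_r2, "test_mse": test_mse}


if __name__ == "__main__":
    print(run_demo())
```

Stumps can only represent additive effects of the inputs, which suits this toy response but would likely be too restrictive for real solubility data; a practical model would use deeper trees from an established library. Reporting the metrics for both splits, as the abstract does, makes any gap between training and test performance visible.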

Source journal
Energy & Fuels (Engineering Technology - Engineering: Chemical)
CiteScore: 9.20
Self-citation rate: 13.20%
Articles published: 1101
Review time: 2.1 months
Journal description: Energy & Fuels publishes reports of research in the technical area defined by the intersection of the disciplines of chemistry and chemical engineering and the application domain of non-nuclear energy and fuels. This includes research directed at the formation of, exploration for, and production of fossil fuels and biomass; the properties and structure or molecular composition of both raw fuels and refined products; the chemistry involved in the processing and utilization of fuels; fuel cells and their applications; and the analytical and instrumental techniques used in investigations of the foregoing areas.