Machine learning with new functional structure descriptors for design and screening of ionic liquids in CO2 efficient capture
Authors: Ranran Geng, Wenjuan Deng, Zhiqiang Hu, JianLei Wang, Yuanyuan Zhao, Baichaun Zhou, Guocai Tian
Journal: Physical Chemistry Chemical Physics (IF 2.9; JCR Q3, Chemistry, Physical)
DOI: 10.1039/d5cp01972a (https://doi.org/10.1039/d5cp01972a)
Publication date: 2025-06-17
Citations: 0
Abstract
Reducing, converting and utilizing carbon dioxide emissions are pressing and difficult challenges worldwide. As a new class of green solvents, ionic liquids (ILs) are widely used in CO2 capture and conversion, but the number of possible ILs is enormous (more than 10¹⁸), so selecting and screening appropriate ILs for CO2 capture is an urgent problem. It is therefore of great significance to establish Quantitative Structure-Property Relationships (QSPR) of ILs for CO2 capture. From the practical standpoint of IL design and synthesis, a new functional structure descriptor (FSD) based on the group contribution (GC) method was constructed. In addition, rather than following the conventional machine-learning approach of raising the descriptor dimension to raise accuracy, we examined the feasibility of reducing the dimension while maintaining accuracy, and constructed a dimensionless molecular descriptor, CORE. Using these two new molecular descriptors, we compared the performance of six common ensemble learning models (CatBoost, LightGBM, XGBoost, GBDT, RF and AdaBoost) in predicting CO2 solubility in ILs. All six models perform well, with CatBoost performing best: the CatBoost-FSD model achieves an R² of 0.9945 and an MAE of 0.0108, and the CatBoost-CORE model an R² of 0.9925 and an MAE of 0.0120. The interpretability of the CatBoost-CORE model was analyzed and the key features were identified. Based on the CORE descriptor, the best experimental conditions were obtained, and nine ILs with superior performance are recommended.
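The abstract reports model quality as R² and MAE. As a reminder of what those two figures measure, here is a minimal pure-Python sketch using their standard definitions. The solubility values below are made up for illustration and are not taken from the paper, and the function names are our own, not the authors' code.

```python
from typing import Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute deviation between measured and predicted values."""
    if len(y_true) == 0 or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) == 0 or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error left after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: spread of the measurements around their mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are identical")
    return 1.0 - ss_res / ss_tot


# Hypothetical CO2 solubilities (illustrative numbers only).
measured = [0.10, 0.25, 0.40, 0.55]
predicted = [0.12, 0.24, 0.41, 0.53]

mae = mean_absolute_error(measured, predicted)
r2 = r2_score(measured, predicted)
print(f"MAE = {mae:.4f}, R2 = {r2:.4f}")
```

An R² close to 1 means the predictions explain almost all of the variance in the measured values, while MAE gives the typical error in the same units as the target. That is why the two are usually reported together.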
About the journal:
Physical Chemistry Chemical Physics (PCCP) is an international journal co-owned by 19 physical chemistry and physics societies from around the world. This journal publishes original, cutting-edge research in physical chemistry, chemical physics and biophysical chemistry. To be suitable for publication in PCCP, articles must include significant innovation and/or insight into physical chemistry; this is the most important criterion that reviewers and Editors will judge against when evaluating submissions.
The journal has a broad scope and welcomes contributions spanning experiment, theory, computation and data science. Topical coverage includes spectroscopy, dynamics, kinetics, statistical mechanics, thermodynamics, electrochemistry, catalysis, surface science, quantum mechanics, quantum computing and machine learning. Interdisciplinary research areas such as polymers and soft matter, materials, nanoscience, energy, surfaces/interfaces, and biophysical chemistry are welcomed if they demonstrate significant innovation and/or insight into physical chemistry. Joint experimental/theoretical studies are particularly appreciated when complementary and based on up-to-date approaches.