Title: Advancing CO2 solubility prediction in aqueous solutions: A machine learning approach for CCUS application
Authors: Gideon Gyamfi, Xiaoli Li
Journal: Geoenergy Science and Engineering, volume 257, Article 214175 (impact factor 4.6; JCR category: Energy & Fuels)
Published: 2025-08-27
DOI: 10.1016/j.geoen.2025.214175
URL: https://www.sciencedirect.com/science/article/pii/S2949891025005330
Citations: 0
Abstract
Accurately predicting CO2 solubility in aqueous solutions (pure water and brines) is essential for optimizing carbon capture, utilization, and storage (CCUS) processes. In this study, the experimental dataset covers eight salts: sodium chloride (NaCl), calcium chloride (CaCl2), magnesium chloride (MgCl2), potassium chloride (KCl), sodium bicarbonate (NaHCO3), sodium sulfate (Na2SO4), potassium carbonate (K2CO3), and magnesium sulfate (MgSO4). The dataset spans pressures of 0.1–50 MPa, temperatures of 274–453 K, and salinities of 0–15 mol/kg. The objective is to develop a robust predictive model for CO2 solubility using three machine learning methods: Random Forest (RF), Gradient Boosting (GB), and an Ensemble algorithm. Data preprocessing comprises standardization, outlier removal, and the conversion of salinity to ionic strength via the Debye-Hückel method. Hyperparameter optimization and cross-validation are used to improve model robustness and mitigate overfitting. Among the implemented models, the Ensemble model performs best, achieving R² values of 0.9916, 0.9832, and 0.9934 and mean squared error values of 0.0078, 0.0122, and 0.0056 for the training, validation, and testing datasets, respectively. Sensitivity analysis of feature importance indicates that pressure is the predominant factor influencing CO2 solubility, followed closely by ionic strength and temperature. The study also identifies potassium carbonate (K2CO3) as exhibiting a notably high affinity for CO2, especially at 353 K. Visualizing predicted CO2 solubility across varying ionic strength, temperature, and pressure supports the models’ capacity to capture the interactions among these parameters. These results provide a robust and accurate framework for predicting CO2 solubility, advancing CCUS strategies, and improving understanding of CO2 behavior in brine systems under diverse conditions.
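
The abstract mentions converting salinity to ionic strength but does not give the formula. As a minimal sketch only, the Python below assumes the standard molal definition, I = 0.5 · Σ m_i · z_i², with every salt treated as fully dissociated. The names `SALT_IONS` and `ionic_strength` are hypothetical and not from the paper, and the authors' actual preprocessing may differ (for example, carbonate and bicarbonate speciation is ignored here).

```python
# Illustrative sketch, not the authors' code.
# Assumption: complete dissociation of each salt into its constituent ions.
# Each salt maps to a list of (ion charge, number of ions per formula unit).
SALT_IONS = {
    "NaCl":   [(+1, 1), (-1, 1)],
    "KCl":    [(+1, 1), (-1, 1)],
    "NaHCO3": [(+1, 1), (-1, 1)],
    "CaCl2":  [(+2, 1), (-1, 2)],
    "MgCl2":  [(+2, 1), (-1, 2)],
    "Na2SO4": [(+1, 2), (-2, 1)],
    "K2CO3":  [(+1, 2), (-2, 1)],
    "MgSO4":  [(+2, 1), (-2, 1)],
}


def ionic_strength(salt: str, molality: float) -> float:
    """Return the molal ionic strength I = 0.5 * sum(m_i * z_i**2) in mol/kg.

    `molality` is the salt concentration in mol per kg of water.
    """
    if molality < 0:
        raise ValueError("molality must be non-negative")
    ions = SALT_IONS[salt]  # raises KeyError for salts outside the table
    # The molality of each ion is (ions per formula unit) * (salt molality).
    return 0.5 * sum(count * molality * charge ** 2 for charge, count in ions)


if __name__ == "__main__":
    for name in SALT_IONS:
        print(f"{name}: I = {ionic_strength(name, 1.0):.1f} mol/kg at 1 mol/kg")
```

Under this assumption, salts with the same molality can differ widely in ionic strength: a 1:1 salt such as NaCl gives I equal to its molality, while a 2:2 salt such as MgSO4 gives four times its molality. This is one reason ionic strength is a more uniform model input than raw salinity across mixed salt types.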