Jianghua Tian, Runqiu Dong, Hanbing Jia, Zhiyong Peng, Zhigang Liu, Le Wang, Lei Yi, Jialing Xu, Hui Jin, Bin Chen, Liejin Guo
Title: Interpretable machine learning for predicting and evaluating hydrogen production from supercritical water gasification of coal
Journal: Fuel, volume 404, Article 136173 (impact factor 7.5; JCR Q2, Energy & Fuels)
Publication date: 2025-07-05
DOI: 10.1016/j.fuel.2025.136173
URL: https://www.sciencedirect.com/science/article/pii/S0016236125018988
Abstract
Optimizing the coal Supercritical Water Gasification (SCWG) process with Machine Learning (ML) models is a promising strategy for conserving experimental resources. However, the lack of diversity in ML models and the neglect of their interpretability in existing work may limit the development of coal SCWG technology. This paper systematically collected 233 experimental results (1631 data points) to develop five ML models for analyzing coal SCWG: Support Vector Regression (SVR), AdaBoost Regression (ABR), Decision Tree (DT), Random Forest (RF) Regression and Gradient Boosting Regression (GBR). Of the five, DT and GBR showed the most robust predictive ability, performing best on Mean Square Error (MSE), coefficient of determination (R²) and Mean Absolute Error (MAE). Analysis based on SHapley Additive exPlanations (SHAP) values identified Temperature (TEMP) and Residence Time (RT) as the main factors controlling gas production, and both show a significant positive correlation with gas production. The SHAP values of the GBR model interpret well how the coal SCWG parameters act; in particular, Concentration (CR) is negatively correlated with the gasification yields of H₂, CO and CO₂, while it is positively correlated with the gas yield of CH₄. Given its predictive ability (MSE of 0.54, R² of 0.97, MAE of 0.19) and the interpretability of the mechanism, the GBR model may be a superior tool for supporting coal SCWG technology. Error analysis and catalyst were added to the GBR model as input features to further enhance its robustness. Compared with the kinetic model, the GBR model improved the accuracy and generalization ability of the four-gas yield prediction by expanding the input parameters (TEMP, RT, CR, error, catalyst type and concentration). This work would be of great value in the prediction and optimization of the coal SCWG process.
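To make the abstract's two evaluation ideas concrete, the following is a minimal sketch of how MSE, MAE and R² are computed and how Shapley values attribute a prediction to its inputs. It is not the authors' code or data: `toy_yield` is a hypothetical stand-in for a trained model, its coefficients are invented for illustration, and the brute-force enumeration over feature orderings replaces the approximations that SHAP tooling would use on a real GBR model.

```python
from itertools import permutations


def mse(y_true, y_pred):
    """Mean Square Error: average of squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean Absolute Error: average of absolute residuals."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def toy_yield(temp, rt, cr):
    """Hypothetical yield model (assumption, not fitted to any data)."""
    return 0.02 * temp + 0.5 * rt - 0.3 * cr + 0.001 * temp * rt


def shapley_values(model, x, baseline):
    """Exact Shapley values relative to a baseline input.

    Each feature's value is its marginal contribution to the prediction,
    averaged over every order in which features can be switched from the
    baseline value to the value in x.
    """
    names = list(x)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)
        previous = model(**current)
        for name in order:
            current[name] = x[name]
            updated = model(**current)
            totals[name] += updated - previous
            previous = updated
    return {name: total / len(orderings) for name, total in totals.items()}


if __name__ == "__main__":
    # Illustrative operating points; the numbers are assumptions.
    x = {"temp": 650.0, "rt": 20.0, "cr": 10.0}
    baseline = {"temp": 550.0, "rt": 10.0, "cr": 5.0}
    print(shapley_values(toy_yield, x, baseline))
    print(mse([1, 2, 3, 4], [1.1, 1.9, 3.2, 3.8]))
```

In this toy setup the attributions for `temp` and `rt` are positive and the one for `cr` is negative, which mirrors the direction the abstract reports for H₂ only because the invented coefficients were chosen that way. The attributions always sum to the difference between the prediction at `x` and at `baseline`, which is the additivity property that SHAP relies on.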
About the journal:
The exploration of energy sources remains a critical matter of study. For the past nine decades, Fuel has consistently been at the forefront of primary research in energy science. Its scope covers a wide range of subjects, with particular emphasis on emerging concerns such as environmental factors and pollution.