Interpretable machine learning model for bio-oil property prediction from hydrothermal liquefaction of biomass via graph neural networks-based molecular structure
Authors: Yue Tian, Junghwan Kim, Yi Liu
Journal: Energy, volume 329, Article 136708 (JCR Q1, Energy & Fuels; impact factor 9.0)
Publication date: 2025-05-21
Publication type: Journal Article
DOI: 10.1016/j.energy.2025.136708
URL: https://www.sciencedirect.com/science/article/pii/S0360544225023503
Citation count: 0
Abstract
Hydrothermal bio-oil is a promising fuel that could help alleviate the energy crisis and contribute to carbon neutrality. However, its application performance depends primarily on its yield, elemental composition (C, H, O, N, and S contents), and higher heating value (HHV). We therefore propose an interpretable machine learning model for predicting hydrothermal bio-oil properties. Graph neural network (GNN)-based molecular structure information was used to enhance the datasets. Single- and multi-task prediction models were built with extreme gradient boosting (XGB), random forest (RF), gradient boosting decision tree (GBDT), and deep neural network (DNN) algorithms. Shapley values and partial dependence were used to interpret the impact of input features on the output responses. The XGB model performed best after the GNN was applied (average train R² = 0.95, RMSE = 1.29, MAE = 0.69; average test R² = 0.91, RMSE = 1.30, MAE = 0.93), with an average performance improvement of 6.74–7.95 %. The model also performed well on unseen data, achieving an average R² of 0.916. Temperature and the ultimate analysis of the biomass emerged as the pivotal features influencing the outputs. Triglycerides had a stronger influence than fatty acids owing to their higher carbon content. A combination of high temperature (>350 °C) and elevated triglyceride content (>30 %) increased bio-oil yield (∼40 wt%) and HHV (∼35 MJ/kg).
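The abstract reports fit metrics (R², RMSE, MAE) and Shapley values without defining them. The sketch below is one minimal, standard-library way to compute both. It is not the authors' code: the function names, the toy response surface `toy_yield`, its coefficients, and the baseline point are all hypothetical and chosen only for illustration. Shapley values are computed here by exact enumeration of feature coalitions, which is feasible only for a handful of features; whether the paper used an approximation for its tree models is not stated in the abstract.

```python
from itertools import combinations
from math import factorial, sqrt


def r2_rmse_mae(y_true, y_pred):
    """Return (R^2, RMSE, MAE) for two equal-length sequences."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    return 1 - ss_res / ss_tot, sqrt(ss_res / n), mae


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to a baseline point.

    Features outside a coalition are set to their baseline value.
    The cost grows as 2^n, so this is for illustration only.
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for a coalition of this size that excludes feature i
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


# Hypothetical toy response surface, NOT the paper's model.
# Inputs: [temperature in degrees C, triglyceride content in %].
def toy_yield(z):
    temperature, triglyceride = z
    return (10.0 + 0.05 * temperature + 0.3 * triglyceride
            + 0.001 * temperature * triglyceride)


x = [350.0, 30.0]          # point to explain (assumed values)
baseline = [300.0, 10.0]   # reference point (assumed values)
phi = shapley_values(toy_yield, x, baseline)
print("Shapley values [temperature, triglyceride]:", phi)
print("Sum of contributions:", sum(phi))
print("Prediction minus baseline:", toy_yield(x) - toy_yield(baseline))
```

For this toy function the two contributions come out at approximately 3.5 (temperature) and 12.5 (triglyceride content), and they sum to the difference between the prediction at `x` and at `baseline`. That additivity is what makes Shapley values convenient for ranking features such as temperature against composition. The interaction term is split equally between the two features.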
About the journal:
Energy is a multidisciplinary, international journal that publishes research and analysis in the field of energy engineering. Our aim is to become a leading peer-reviewed platform and a trusted source of information for energy-related topics.
The journal covers a range of areas including mechanical engineering, thermal sciences, and energy analysis. We are particularly interested in research on energy modelling, prediction, integrated energy systems, planning, and management.
Additionally, we welcome papers on energy conservation, efficiency, biomass and bioenergy, renewable energy, electricity supply and demand, energy storage, buildings, and economic and policy issues. These topics should align with our broader multidisciplinary focus.