Data-driven machine learning with SHAP for predicting multiple biochar properties from agricultural and forestry waste

Impact factor 5.8; Zone 2 (Biology); JCR Q1 (Agricultural Engineering)
Kun Xiang, Hualin Chen, Chuanqing Yao, Yuanyuan Wan, Han Ren, Jiangmin Zhou
Biomass & Bioenergy, volume 200, Article 107958. Published 2025-05-15. DOI: 10.1016/j.biombioe.2025.107958. Full text: https://www.sciencedirect.com/science/article/pii/S0961953425003691

Abstract

The pyrolysis of agricultural and forestry waste for biochar production is an effective method for resource utilization and carbon sequestration. The diversity of these wastes leads to variations in biochar properties, which limits its applications. In recent years, machine learning (ML) models have been developed to predict biochar properties. However, a generalized predictive model capable of evaluating multiple biochar properties simultaneously is still lacking. This study systematically evaluated the performance of six ML models (CatBoost, LightGBM, NGBoost, XGBoost, RandomForest, and AdaBoost) using 288 pyrolysis datasets from 43 types of agricultural and forestry waste to predict biochar yield (BY), carbon content (C), fixed carbon content (FC), ash content, energy yield (EY), higher heating value (HHV), pH, and cation exchange capacity (CEC). Among these models, CatBoost demonstrated the best performance, with R² values ranging from 0.76 to 0.95. Shapley Additive Explanations (SHAP) analysis identified pyrolysis temperature as a key factor influencing biochar yield, carbon content, and pH, while feedstock composition (e.g., cellulose and lignin) was found to significantly affect CEC and FC. Because the CatBoost model could be used to adjust pyrolysis parameters dynamically based on feedstock, a modular visual prediction tool was built on it. The tool supports large-scale production of biochar tailored to specific applications by allowing precise selection of feedstock types and pyrolysis conditions.
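The two quantities the abstract relies on, R² and Shapley-value attributions, can be illustrated with a small self-contained sketch. Everything below is an assumption made for illustration: `toy_yield_model` and its coefficients are invented and are not the paper's CatBoost model, the `sample` and `baseline` feature values are hypothetical, and the attribution is computed by exact enumeration of feature coalitions rather than with the SHAP library's estimators that the study presumably used. Exact enumeration is only practical for a handful of features.

```python
from itertools import combinations
from math import factorial


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def toy_yield_model(x):
    """Hypothetical stand-in for a trained regressor (invented coefficients).

    x = (temperature_C, lignin_pct, cellulose_pct) -> biochar yield in %.
    """
    temperature, lignin, cellulose = x
    return (60.0 - 0.05 * temperature + 0.30 * lignin
            - 0.10 * cellulose + 0.0002 * temperature * lignin)


def shapley_values(model, x, baseline):
    """Exact Shapley values of `model` at `x` relative to `baseline`.

    Features outside a coalition are replaced by their baseline value.
    """
    n = len(x)

    def value(coalition):
        mixed = tuple(x[i] if i in coalition else baseline[i] for i in range(n))
        return model(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                total += weight * (value(members | {i}) - value(members))
        phi.append(total)
    return phi


# Hypothetical sample and reference point: (temperature, lignin, cellulose).
sample = (500.0, 25.0, 40.0)
baseline = (400.0, 20.0, 40.0)
phi = shapley_values(toy_yield_model, sample, baseline)

for name, contribution in zip(("temperature", "lignin", "cellulose"), phi):
    print(f"{name}: {contribution:+.3f}")
print("prediction shift:", toy_yield_model(sample) - toy_yield_model(baseline))
```

In this toy setting the attributions add up to the difference between the prediction at `sample` and at `baseline`, which is the additivity property that gives SHAP its name. A feature whose value equals its baseline (cellulose here) receives zero attribution.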
Source journal: Biomass & Bioenergy
Category: Engineering & Technology, Energy & Fuels
CiteScore: 11.50
Self-citation rate: 3.30%
Articles published: 258
Review time: 60 days
Journal description: Biomass & Bioenergy is an international journal publishing original research papers, short communications, review articles, and case studies on biological resources, chemical and biological processes, and biomass products for new renewable sources of energy and materials. The scope of the journal extends to the environmental, management, and economic aspects of biomass and bioenergy. Key areas covered by the journal:
• Biomass: sources, energy crop production processes, genetic improvements, composition. Research on these biomass subjects must be linked directly to bioenergy generation.
• Biological Residues: residues from agricultural production, forestry and plantations (palm, sugar, etc.), processing industries, and municipal sources (MSW). Papers on the use of biomass residues through innovative processes or technological novelty, or that consider feedstock or system sustainability (or unsustainability), are welcomed. However, waste treatment processes and pollution control or mitigation that are only tangentially related to bioenergy are outside the scope of the journal, as they are better suited to publications in the environmental arena. Papers that describe conventional waste streams (i.e., those well described in existing literature) without empirically addressing "new" added value from the process are not suitable for submission.
• Bioenergy Processes: fermentations, thermochemical conversions, liquid and gaseous fuels, and petrochemical substitutes.
• Bioenergy Utilization: direct combustion, gasification, electricity production, chemical processes, and by-product remediation.
• Biomass and the Environment: carbon cycle, the net energy efficiency of bioenergy systems, assessment of sustainability, and biodiversity issues.