Kun Xiang, Hualin Chen, Chuanqing Yao, Yuanyuan Wan, Han Ren, Jiangmin Zhou
Biomass & Bioenergy, Volume 200, Article 107958. Published 2025-05-15. DOI: 10.1016/j.biombioe.2025.107958. Impact factor: 5.8; JCR: Q1 (Agricultural Engineering).
https://www.sciencedirect.com/science/article/pii/S0961953425003691
Data-driven machine learning with SHAP for predicting multiple biochar properties from agricultural and forestry waste
Pyrolysis of agricultural and forestry waste to produce biochar is an effective route to resource utilization and carbon sequestration. The diversity of these wastes leads to variation in biochar properties, which limits biochar's applications. In recent years, machine learning (ML) models have been developed to predict biochar properties, but a generalized predictive model capable of evaluating multiple biochar properties simultaneously is still lacking. This study systematically evaluated six ML models (CatBoost, LightGBM, NGBoost, XGBoost, RandomForest, and AdaBoost) using 288 pyrolysis datasets from 43 types of agricultural and forestry waste to predict biochar yield (BY), carbon content (C), fixed carbon content (FC), ash content, energy yield (EY), higher heating value (HHV), pH, and cation exchange capacity (CEC). Among these models, CatBoost performed best, with R² values ranging from 0.76 to 0.95. Shapley Additive Explanations (SHAP) analysis identified pyrolysis temperature as a key factor influencing biochar yield, carbon content, and pH, while feedstock composition (e.g., cellulose and lignin) significantly affected CEC and FC. Because the CatBoost model can support dynamic adjustment of pyrolysis parameters according to the feedstock, a modular visual prediction tool was built on it. The tool allows precise selection of feedstock types and pyrolysis conditions, supporting large-scale production of biochar tailored to specific applications.
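The two quantities the abstract relies on, R² for model fit and Shapley values for feature attribution, can be illustrated with a minimal standard-library sketch. This is not the authors' pipeline, which uses CatBoost together with the SHAP method. The model `toy_yield`, its coefficients, the feature names `temperature_c` and `lignin_frac`, and the sample and baseline values are all hypothetical and chosen only to make the arithmetic easy to follow.

```python
from itertools import combinations
from math import factorial


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline).

    Features outside a coalition take their baseline value. The cost grows
    exponentially with the number of features, so this suits toy cases only.
    """
    names = list(x)
    n = len(names)

    def value(coalition):
        point = {k: (x[k] if k in coalition else baseline[k]) for k in names}
        return model(point)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size: |S|! (n-|S|-1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {name}) - value(s))
        phi[name] = total
    return phi


def toy_yield(p):
    # HYPOTHETICAL linear stand-in for a trained model; coefficients are
    # invented for illustration and are not taken from the study.
    return 60.0 - 0.05 * p["temperature_c"] + 20.0 * p["lignin_frac"]


# Hypothetical sample and reference point (not data from the paper).
sample = {"temperature_c": 600.0, "lignin_frac": 0.30}
baseline = {"temperature_c": 400.0, "lignin_frac": 0.20}

phi = shapley_values(toy_yield, sample, baseline)
print(phi)
```

In this toy case the attributions sum to the difference between the prediction for the sample and the prediction for the baseline, which is the additivity property that gives SHAP its name. Practical SHAP tooling for tree ensembles uses faster tree-specific algorithms rather than this brute-force enumeration.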
About the journal:
Biomass & Bioenergy is an international journal publishing original research papers and short communications, review articles and case studies on biological resources, chemical and biological processes, and biomass products for new renewable sources of energy and materials.
The scope of the journal extends to the environmental, management and economic aspects of biomass and bioenergy.
Key areas covered by the journal:
• Biomass: sources, energy crop production processes, genetic improvements, composition. Please note that research on these biomass subjects must be linked directly to bioenergy generation.
• Biological Residues: residues from agricultural production, forestry and plantations (palm, sugar, etc.), processing industries, and municipal sources (MSW). Papers on the use of biomass residues through innovative processes/technological novelty and/or consideration of feedstock/system sustainability (or unsustainability) are welcomed. However, waste treatment processes and pollution control or mitigation that are only tangentially related to bioenergy are outside the scope of the journal, as they are better suited to publications in the environmental arena. Papers that describe conventional waste streams (i.e., those well described in existing literature) and do not empirically address "new" added value from the process are not suitable for submission to the journal.
• Bioenergy Processes: fermentations, thermochemical conversions, liquid and gaseous fuels, and petrochemical substitutes.
• Bioenergy Utilization: direct combustion, gasification, electricity production, chemical processes, and by-product remediation.
• Biomass and the Environment: carbon cycle, the net energy efficiency of bioenergy systems, assessment of sustainability, and biodiversity issues.