Machine learning models for predicting enzymatic hydrolysis yields of lignocellulosic biomass after various pretreatments
Tiantian Xie, Meishan Fan
Industrial Crops and Products, volume 235, Article 121644. Published 2025-08-12.
DOI: 10.1016/j.indcrop.2025.121644
https://www.sciencedirect.com/science/article/pii/S0926669025011902
This study compares six machine learning models for predicting the enzymatic conversion efficiency of lignocellulosic biomass, using inputs that span biomass composition and pretreatment operating parameters. Support vector regression (SVR) performed best among all models, with R² values exceeding 0.90 on the test sets for the target variables. After optimization, SVR achieved coefficients of determination (R²) of 0.95 and 0.99 for glucose and xylose yields, respectively. Solution pH was identified as the dominant factor influencing chemical composition, structural characteristics, and solid yield during pretreatment. SHapley Additive exPlanations (SHAP) and gradient-based importance quantification were used to derive first-principles interpretations of the feature-response relationships, revealing nonlinear interdependencies between pretreatment-induced structural modifications and subsequent enzymatic accessibility. A software tool was developed to predict glucose and xylose yields following different pretreatments. The study offers new insight into the critical determinants, and the synergistic relationships among them, that affect the pretreatment process and the enzymatic hydrolysis yields of lignocellulosic biomass.
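The abstract reports model quality as a coefficient of determination (R²). As a minimal sketch, assuming the standard definition R² = 1 - SS_res / SS_tot (the evaluation code used in the study is not given here), the metric can be computed as follows. The function name `r_squared` and the yield values are hypothetical illustrations, not data from the paper.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: error left over after the model's predictions.
    ss_res = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    # Total sum of squares: spread of the observations around their mean.
    ss_tot = sum((y - mean_obs) ** 2 for y in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


# Hypothetical held-out glucose yields (%) and model predictions (assumed values).
observed_yield = [40.0, 50.0, 60.0, 70.0, 80.0]
predicted_yield = [42.0, 49.0, 61.0, 68.0, 80.0]
print(round(r_squared(observed_yield, predicted_yield), 3))  # 0.99
```

An R² of 1 means the predictions match the observations exactly, while 0 means the model does no better than always predicting the mean observed yield.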
About the journal:
Industrial Crops and Products is an international journal publishing academic and industrial research on industrial (defined as non-food/non-feed) crops and products. Papers cover both crop-oriented research and research on bio-based materials derived from crops. They should be of interest to an international audience, be hypothesis driven, and include statistical analysis where comparisons are made.