Yanming Qiao , Ehsan Kargaran , Hao Ji , Meysam Madadi , Saeed Rafieyan , Dan Liu
Bioresource Technology, Volume 430, Article 132582. Published 2025-04-23. Impact factor 9.7; JCR Q1 (Agricultural Engineering). DOI: 10.1016/j.biortech.2025.132582. URL: https://www.sciencedirect.com/science/article/pii/S0960852425005486
Data-driven insights for enhanced cellulose conversion to 5-hydroxymethylfurfural using machine learning
Converting cellulose into 5-hydroxymethylfurfural (HMF) is a promising route to bio-based chemicals, offering sustainable alternatives to petroleum-based materials in polymers, biofuels, and pharmaceuticals. However, efficient production of HMF from cellulose is hampered by the complex interplay of numerous operational variables. This study develops a machine learning (ML) model to optimize HMF production and conducts a feature importance analysis to identify the key factors affecting HMF yield. Bayesian optimization is then employed for multi-objective optimization aimed at maximizing HMF yield. A comprehensive dataset, sourced from existing literature, was subjected to statistical analysis to elucidate the influence of each factor on HMF production. Among the eight models evaluated, the CatBoost Regressor was the most effective, with a test R² of 0.76, an RMSE of 4.72, and an MAE of 5.2. Feature importance analysis revealed that operational conditions, particularly time and temperature, were the most significant, accounting for 41.0% of the variability, followed by catalyst properties at 33.0% and solvent properties at 26.0%. The ML-based optimization predicted an HMF yield of 48.1%; experimental validation gave 47.6% in the first run and 49.3% in the second, corresponding to relative errors of −1% and 2.5%, respectively. This research showcases ML's ability to address challenges in cellulose-to-HMF conversion, offering insights for optimizing production and advancing sustainable manufacturing.
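The validation figures quoted in the abstract can be checked with a few lines of arithmetic. The sketch below assumes the relative error is defined as (measured − predicted) / predicted, which the abstract does not state explicitly; the function name `relative_error` is illustrative and not taken from the paper.

```python
def relative_error(measured: float, predicted: float) -> float:
    """Signed relative error of a measurement against a prediction, in percent.

    Assumed definition: (measured - predicted) / predicted * 100.
    """
    return (measured - predicted) / predicted * 100.0


predicted_yield = 48.1          # ML-optimized HMF yield (%), from the abstract
validation_runs = [47.6, 49.3]  # experimental HMF yields (%), from the abstract

errors = [relative_error(y, predicted_yield) for y in validation_runs]

for run, (y, err) in enumerate(zip(validation_runs, errors), start=1):
    print(f"run {run}: measured {y:.1f}%, relative error {err:+.1f}%")
# run 1: measured 47.6%, relative error -1.0%
# run 2: measured 49.3%, relative error +2.5%
```

Under this assumed definition, both values match the reported −1% and 2.5% after rounding.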
About the journal:
Bioresource Technology publishes original articles, review articles, case studies, and short communications covering the fundamentals, applications, and management of bioresource technology. The journal seeks to advance and disseminate knowledge across various areas related to biomass, biological waste treatment, bioenergy, biotransformations, bioresource systems analysis, and associated conversion or production technologies.
Topics include:
• Biofuels: liquid and gaseous biofuels production, modeling and economics
• Bioprocesses and bioproducts: biocatalysis and fermentations
• Biomass and feedstocks utilization: bioconversion of agro-industrial residues
• Environmental protection: biological waste treatment
• Thermochemical conversion of biomass: combustion, pyrolysis, gasification, catalysis.