Rong Huang, Hui Xu, Ezequiel Santillan, Di Jin, Zhenju Sun, David C. Stuckey, Yan Zhou, Stefan Wuertz, Shunzhi Qian
Bioresource Technology, Volume 430, Article 132561. Published 2025-04-19. DOI: 10.1016/j.biortech.2025.132561
https://www.sciencedirect.com/science/article/pii/S0960852425005279
Predicting single-cell protein production from food-processing wastewater in sequencing batch reactors using ensemble learning
Producing single-cell protein (SCP) from food-processing wastewater offers a sustainable approach to resource recovery, animal feed production, and wastewater treatment. Decision-makers need accurate system performance data under variable influent conditions to select efficient operational parameters. However, predicting system performance under variable conditions is challenging due to the complexity of unsteady-state bioreactions. This study trained and tested ensemble learning algorithms, including an ensemble of Support Vector Regression models, an ensemble of Gaussian Process Regression (GPR) models, Random Forest, and Extreme Gradient Boosting, to predict outcomes in a continuous-inflow, sequencing-batch-reactor-based SCP system using industrial soybean-processing wastewater. Interpretability analysis and trials validated feature significance for model optimization. Results show that ensemble-learning models, particularly GPR-based ones, outperform linear regression in predicting key effluent and biomass variables essential for operational decision-making. Notably, GPR-based ensembles using influential features predict biomass production (coefficient of determination R² = 0.72) and resist overfitting much better than linear regression (R² = 0.4).
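The comparison reported in the abstract, an ensemble of nonlinear regressors scored against linear regression by the coefficient of determination, can be illustrated with a minimal sketch. Everything below is an assumption for illustration only: the data are synthetic (not from the study), the base learner is a simple Gaussian-kernel smoother standing in for GPR (it is not a Gaussian process), the ensemble is built by bootstrap aggregation, and all function names are hypothetical. Only the R² definition, 1 - SS_res / SS_tot, is standard.

```python
import math
import random
from statistics import mean


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_line(xs, ys):
    """Ordinary least squares with one feature; returns a predict function."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return lambda x: intercept + slope * x


def fit_kernel_smoother(xs, ys, bandwidth=0.5):
    """Gaussian-kernel weighted average of training targets.

    Assumption: a simple nonlinear stand-in for the GPR base learner,
    not a Gaussian process.
    """
    def predict(x):
        weights = [math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in xs]
        return sum(w * y for w, y in zip(weights, ys)) / sum(weights)
    return predict


def fit_bagged(xs, ys, fit_base, n_models=25, seed=0):
    """Bootstrap-aggregated ensemble: average of base models fit on resamples."""
    rng = random.Random(seed)
    n = len(xs)
    members = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        members.append(fit_base([xs[i] for i in idx], [ys[i] for i in idx]))
    return lambda x: mean(m(x) for m in members)


# Synthetic nonlinear data (hypothetical; unrelated to the reactor data).
rng = random.Random(42)
xs = [rng.uniform(0.0, 6.0) for _ in range(120)]
ys = [math.sin(x) + rng.gauss(0.0, 0.2) for x in xs]

# Hold out the last 40 points so R² is measured on unseen data.
x_train, y_train = xs[:80], ys[:80]
x_test, y_test = xs[80:], ys[80:]

linear = fit_line(x_train, y_train)
bagged = fit_bagged(x_train, y_train, fit_kernel_smoother)

r2_linear = r_squared(y_test, [linear(x) for x in x_test])
r2_bagged = r_squared(y_test, [bagged(x) for x in x_test])
print(f"linear regression R2 = {r2_linear:.2f}")
print(f"bagged smoother   R2 = {r2_bagged:.2f}")
```

Scoring on held-out points rather than on the training set is what makes the R² values informative about overfitting: a model that memorizes the training data scores well in-sample and poorly here.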
About the journal:
Bioresource Technology publishes original articles, review articles, case studies, and short communications covering the fundamentals, applications, and management of bioresource technology. The journal seeks to advance and disseminate knowledge across various areas related to biomass, biological waste treatment, bioenergy, biotransformations, bioresource systems analysis, and associated conversion or production technologies.
Topics include:
• Biofuels: liquid and gaseous biofuels production, modeling and economics
• Bioprocesses and bioproducts: biocatalysis and fermentations
• Biomass and feedstocks utilization: bioconversion of agro-industrial residues
• Environmental protection: biological waste treatment
• Thermochemical conversion of biomass: combustion, pyrolysis, gasification, catalysis.