Predicting single-cell protein production from food-processing wastewater in sequencing batch reactors using ensemble learning

IF 9.7 | Zone 1 (Environmental Science and Ecology) | JCR Q1 (Agricultural Engineering)
Rong Huang, Hui Xu, Ezequiel Santillan, Di Jin, Zhenju Sun, David C. Stuckey, Yan Zhou, Stefan Wuertz, Shunzhi Qian
Journal: Bioresource Technology, volume 430, Article 132561
Published: 2025-04-19
DOI: 10.1016/j.biortech.2025.132561
URL: https://www.sciencedirect.com/science/article/pii/S0960852425005279
Citations: 0

Abstract


Predicting single-cell protein production from food-processing wastewater in sequencing batch reactors using ensemble learning

Producing single-cell protein (SCP) from food-processing wastewater offers a sustainable approach to resource recovery, animal feed production, and wastewater treatment. To select efficient operational parameters, decision-makers need accurate data on system performance under variable influent conditions. However, predicting performance under such conditions is challenging because unsteady-state bioreactions are complex. This study trained and tested ensemble learning algorithms (an ensemble of Support Vector Regression models, an ensemble of Gaussian Process Regression (GPR) models, Random Forest, and Extreme Gradient Boosting) to predict outcomes in a continuous-inflow, sequencing-batch-reactor-based SCP system fed with industrial soybean-processing wastewater. Interpretability analysis and trials validated the significance of individual features for model optimization. The results show that ensemble-learning models, particularly GPR-based ones, outperform linear regression in predicting the key effluent and biomass variables needed for operational decision-making. Notably, GPR-based ensembles trained on the influential features predict biomass production (coefficient of determination R² = 0.72) and resist overfitting much better than linear regression (R² = 0.4).
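The abstract compares models by their coefficient of determination (R²). As a minimal sketch of how that score is computed, and of one common way to combine ensemble members, the snippet below averages the predictions of several base models and scores the result. It is not the authors' pipeline: the function names `r_squared` and `ensemble_mean`, the use of simple mean averaging, and all numbers are assumptions made for illustration only.

```python
from statistics import fmean


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    mean_obs = fmean(observed)
    # Residual sum of squares: error left after the model's predictions.
    ss_res = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    # Total sum of squares: error of always predicting the observed mean.
    ss_tot = sum((y - mean_obs) ** 2 for y in observed)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    return 1.0 - ss_res / ss_tot


def ensemble_mean(member_predictions):
    """Average the predictions of several base models, sample by sample.

    Mean averaging is an assumption here; the abstract does not say how
    the ensemble members were combined.
    """
    return [fmean(sample) for sample in zip(*member_predictions)]


# Hypothetical held-out values and member predictions (not data from the paper).
observed = [1.0, 2.0, 3.0, 4.0]
members = [
    [1.2, 1.8, 3.3, 3.9],
    [0.8, 2.2, 2.9, 4.3],
    [1.0, 2.3, 3.1, 3.8],
]
combined = ensemble_mean(members)
print(round(r_squared(observed, combined), 3))
```

A model that always predicts the observed mean scores R² = 0 and a perfect model scores 1, so the reported 0.72 versus 0.4 means the GPR-based ensemble explains a markedly larger share of the variance in biomass production than linear regression does.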
Source journal: Bioresource Technology (Engineering and Technology: Energy and Fuels)
CiteScore: 20.80
Self-citation rate: 19.30%
Articles published: 2013
Review time: 12 days
Journal description: Bioresource Technology publishes original articles, review articles, case studies, and short communications covering the fundamentals, applications, and management of bioresource technology. The journal seeks to advance and disseminate knowledge across various areas related to biomass, biological waste treatment, bioenergy, biotransformations, bioresource systems analysis, and associated conversion or production technologies. Topics include:
• Biofuels: liquid and gaseous biofuels production, modeling and economics
• Bioprocesses and bioproducts: biocatalysis and fermentations
• Biomass and feedstocks utilization: bioconversion of agro-industrial residues
• Environmental protection: biological waste treatment
• Thermochemical conversion of biomass: combustion, pyrolysis, gasification, catalysis