Germán Buitrón, Torsten Meyer, Elizabeth A. Edwards, Virginia Montiel-Corona
Bioresource Technology, Volume 436, Article 132963. Published 2025-07-12. DOI: 10.1016/j.biortech.2025.132963. https://www.sciencedirect.com/science/article/pii/S0960852425009290
Modeling and optimization of bioproduct formation with purple phototrophic bacteria using machine learning
Municipal and industrial wastewater, along with organic waste, can be transformed into valuable bioproducts using purple phototrophic bacteria. This study compares the performance of three machine learning models (Random Forest, XGBoost, and CatBoost) in predicting and optimizing the formation of key bioproducts: polyhydroxybutyrate, polyhydroxyvalerate, 5-aminolevulinic acid, coenzyme Q10, carotenoids, bacteriochlorophylls, and biomass. The models were trained on a dataset compiled from previous studies. Input variables included reaction time; the concentrations of organic matter, ethanol, bicarbonate, levulinic acid, ferric citrate, mineral medium, and N; the C/N ratio; illumination conditions (continuous or intermittent); operation mode (batch or semicontinuous); and volume exchange percentage. Bayesian optimization was applied to train and tune the models. Performance was assessed using R², Pearson correlation, RMSE, and MAPE. CatBoost outperformed the other two models, showing higher predictive correlation and lower error, and was subsequently used for further optimization. Feature importance analysis identified reaction time, mineral medium concentration, and volume exchange percentage as key drivers of bioproduct synthesis. The Particle Swarm Optimization algorithm was applied to determine optimal conditions for each target compound. Under the conditions studied, predicted maximum yields were: 569 mg polyhydroxybutyrate/L, 45 mg polyhydroxyvalerate/L, 79 µmol 5-aminolevulinic acid/L, 13 mg coenzyme Q10/g dw, 7 mg carotenoids/g dw, 17 mg bacteriochlorophylls/g dw, and 2040 mg biomass/L. Optimization suggests that operating as a sequencing batch reactor, using discontinuous illumination for most targets, and reducing the mineral medium concentration are beneficial. Results highlight that each bioproduct requires distinct operational settings, supporting the idea of clustering target compounds.
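As a rough illustration of the four evaluation metrics named in the abstract (R², Pearson correlation, RMSE, and MAPE), the sketch below computes them from their standard textbook definitions. It is not the authors' code: the function name `regression_metrics` and the observed and predicted values are hypothetical, chosen only to show the arithmetic.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return R^2, Pearson r, RMSE, and MAPE (percent) for paired values.

    Assumes no observed value is zero (MAPE is undefined otherwise) and
    that neither series is constant (R^2 and Pearson r need variance).
    """
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    # Residual and total sums of squares
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    # Terms for the Pearson correlation coefficient
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "pearson": cov / math.sqrt(ss_tot * var_p),
        "rmse": math.sqrt(ss_res / n),
        "mape": 100.0 / n * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)),
    }

# Hypothetical observed vs. predicted concentrations (illustrative only)
observed = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]
metrics = regression_metrics(observed, predicted)
```

R² and the Pearson coefficient measure different things: Pearson r captures only linear association, while R² as defined here also penalizes systematic bias, so reporting both alongside an absolute error (RMSE) and a relative error (MAPE) gives a fuller picture when models are compared.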
About the journal:
Bioresource Technology publishes original articles, review articles, case studies, and short communications covering the fundamentals, applications, and management of bioresource technology. The journal seeks to advance and disseminate knowledge across various areas related to biomass, biological waste treatment, bioenergy, biotransformations, bioresource systems analysis, and associated conversion or production technologies.
Topics include:
• Biofuels: liquid and gaseous biofuels production, modeling and economics
• Bioprocesses and bioproducts: biocatalysis and fermentations
• Biomass and feedstocks utilization: bioconversion of agro-industrial residues
• Environmental protection: biological waste treatment
• Thermochemical conversion of biomass: combustion, pyrolysis, gasification, catalysis.