Predictive modeling of Enterococcus sp. removal with limited data from different advanced oxidation processes: A machine learning approach

Impact factor 7.4; Zone 2 (Engineering and Technology); JCR Q1 (Engineering, Chemical)
Pavel Pascacio, David J. Vicente, Fernando Salazar, Sonia Guerra-Rodríguez, Jorge Rodríguez-Chueca
Journal of Environmental Chemical Engineering, vol. 12, no. 3, Article 112530. Published 2024-03-19. DOI: 10.1016/j.jece.2024.112530
https://www.sciencedirect.com/science/article/pii/S2213343724006602
Citations: 0

Abstract

The removal of contaminants through Advanced Oxidation Processes (AOPs) is a complex task that demands the simultaneous consideration of multiple operating parameters, such as the type and concentration of oxidant and catalyst, the type and intensity of radiation, and the composition of the aqueous matrix. Designing efficient AOPs often requires expensive and time-consuming laboratory experiments. To improve this process, this study proposes a Machine Learning approach based on a Random Forest (RF) model to predict Enterococcus sp. concentration in wastewater treated with various AOPs, even when dealing with limited data. To assess our approach under diverse conditions, a data partitioning methodology is used to categorize the different AOPs into three distinct study cases of increasing complexity, from Case I to Case III. The evaluation of the RF model's performance, combined with the data partitioning methodology, demonstrated its usefulness in predicting missing or additional disinfection values at any instant during the AOPs. Specifically, in Case I, the model excels at generalizing predictions across various AOP treatments, followed by Cases II and III, which achieve Root Mean Squared Error (RMSE) values below or comparable to the average RMSE of Case I (0.72) in 8 out of 15 and 2 out of 4 treatments, respectively. Moreover, the effects of imbalanced data on model performance are discussed. This highlights the potential of our approach to assess AOP performance and facilitate the design of new experiments of the same treatment type without the need for additional laboratory trials, even in challenging conditions.
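
The RMSE comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' code. The treatment names, the observed and predicted values, and the assumption that concentrations are expressed in log10 units are all hypothetical; only the reference value 0.72 comes from the abstract. The sketch checks "at or below the reference" and does not try to formalize "comparable to", which the abstract leaves undefined.

```python
import math


def rmse(observed, predicted):
    """Root Mean Squared Error between two equal-length sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical (observed, predicted) concentrations per treatment.
# These numbers are invented for illustration; they are not from the paper.
treatments = {
    "treatment_A": ([6.0, 4.5, 3.0, 1.5], [5.8, 4.9, 2.6, 1.9]),
    "treatment_B": ([6.0, 5.0, 4.0, 2.0], [4.8, 6.1, 2.9, 3.2]),
}

REFERENCE_RMSE = 0.72  # average Case I RMSE quoted in the abstract

# Score each treatment, then list those at or below the reference.
scores = {name: rmse(obs, pred) for name, (obs, pred) in treatments.items()}
at_or_below = [name for name, s in scores.items() if s <= REFERENCE_RMSE]

for name, s in scores.items():
    print(f"{name}: RMSE = {s:.2f}")
print("At or below reference:", at_or_below)
```

Counting the entries of `at_or_below` against the total number of treatments gives a tally of the same form as the "8 out of 15" and "2 out of 4" figures reported above.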

Source journal
Journal of Environmental Chemical Engineering (Environmental Science: Pollution)
CiteScore: 11.40
Self-citation rate: 6.50%
Articles published: 2017
Review time: 27 days
Journal description: The Journal of Environmental Chemical Engineering (JECE) serves as a platform for the dissemination of original and innovative research focusing on the advancement of environmentally-friendly, sustainable technologies. JECE emphasizes the transition towards a carbon-neutral circular economy and a self-sufficient bio-based economy. Topics covered include soil, water, wastewater, and air decontamination; pollution monitoring, prevention, and control; advanced analytics, sensors, impact and risk assessment methodologies in environmental chemical engineering; resource recovery (water, nutrients, materials, energy); industrial ecology; valorization of waste streams; waste management (including e-waste); climate-water-energy-food nexus; novel materials for environmental, chemical, and energy applications; sustainability and environmental safety; water digitalization, water data science, and machine learning; process integration and intensification; recent developments in green chemistry for synthesis, catalysis, and energy; and original research on contaminants of emerging concern, persistent chemicals, and priority substances, including microplastics, nanoplastics, nanomaterials, micropollutants, antimicrobial resistance genes, and emerging pathogens (viruses, bacteria, parasites) of environmental significance.