Predictive modeling of Enterococcus sp. removal with limited data from different advanced oxidation processes: A machine learning approach
Pavel Pascacio, David J. Vicente, Fernando Salazar, Sonia Guerra-Rodríguez, Jorge Rodríguez-Chueca
Journal of Environmental Chemical Engineering, Volume 12, Issue 3, Article 112530. Published 2024-03-19. DOI: 10.1016/j.jece.2024.112530
Impact factor: 7.4 · JCR: Q1 (Engineering, Chemical)
https://www.sciencedirect.com/science/article/pii/S2213343724006602
Citations: 0
Abstract
The removal of contaminants through Advanced Oxidation Processes (AOPs) is a complex task that demands the simultaneous consideration of multiple operating parameters, such as the type and concentration of oxidant and catalyst, the type and intensity of radiation, and the composition of the aqueous matrix. Designing efficient AOPs often requires expensive and time-consuming laboratory experiments. To reduce this burden, this study proposes a Machine Learning approach based on a Random Forest (RF) model to predict Enterococcus sp. concentration in wastewater treated with various AOPs, even when only limited data are available. To assess the approach under diverse conditions, a data partitioning methodology is used to categorize the different AOPs into three distinct study cases of increasing complexity, from Case I to Case III. The evaluation of the RF model's performance, combined with the data partitioning methodology, demonstrated its usefulness in predicting missing or additional disinfection values at any instant during the AOPs. Specifically, the model generalizes best across AOP treatments in Case I, followed by Cases II and III, which achieve Root Mean Squared Error (RMSE) values below or comparable to the average RMSE of Case I (0.72) in 8 out of 15 and 2 out of 4 treatments, respectively. The effects of imbalanced data on model performance are also discussed. These results highlight the potential of the approach to assess AOP performance and to facilitate the design of new experiments of the same treatment type without the need for additional laboratory trials, even in challenging conditions.
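The abstract compares per-treatment RMSE values against the average Case I RMSE of 0.72. As a minimal sketch of that kind of comparison (not the authors' code), the snippet below computes the RMSE for each treatment and counts how many fall at or below a reference value. The treatment names, the observed and predicted values, their units, and the helper `count_within_reference` are all hypothetical and chosen only for illustration; only the reference value 0.72 comes from the abstract.

```python
import math


def rmse(observed, predicted):
    """Root Mean Squared Error between two equal-length sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def count_within_reference(per_treatment, reference, tolerance=0.0):
    """Return per-treatment RMSE scores and the treatments whose RMSE
    is at or below reference + tolerance (hypothetical helper)."""
    scores = {name: rmse(obs, pred) for name, (obs, pred) in per_treatment.items()}
    hits = [name for name, score in scores.items() if score <= reference + tolerance]
    return scores, hits


# Illustrative (observed, predicted) concentrations in arbitrary units.
# These numbers are invented for the example and are NOT data from the paper.
example = {
    "treatment_A": ([6.0, 4.5, 3.0, 1.5], [5.8, 4.9, 2.7, 1.9]),
    "treatment_B": ([6.0, 5.0, 4.0, 3.0], [4.5, 3.8, 5.2, 4.1]),
}
REFERENCE_RMSE = 0.72  # average Case I RMSE reported in the abstract

scores, hits = count_within_reference(example, REFERENCE_RMSE)
for name, score in scores.items():
    print(f"{name}: RMSE = {score:.2f}")
print(f"{len(hits)} of {len(scores)} treatments at or below {REFERENCE_RMSE}")
```

The `tolerance` argument is one possible way to encode "comparable to" the reference; the abstract does not say how that margin was defined, so it defaults to zero here.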
About the journal:
The Journal of Environmental Chemical Engineering (JECE) serves as a platform for the dissemination of original and innovative research focusing on the advancement of environmentally-friendly, sustainable technologies. JECE emphasizes the transition towards a carbon-neutral circular economy and a self-sufficient bio-based economy. Topics covered include soil, water, wastewater, and air decontamination; pollution monitoring, prevention, and control; advanced analytics, sensors, impact and risk assessment methodologies in environmental chemical engineering; resource recovery (water, nutrients, materials, energy); industrial ecology; valorization of waste streams; waste management (including e-waste); climate-water-energy-food nexus; novel materials for environmental, chemical, and energy applications; sustainability and environmental safety; water digitalization, water data science, and machine learning; process integration and intensification; recent developments in green chemistry for synthesis, catalysis, and energy; and original research on contaminants of emerging concern, persistent chemicals, and priority substances, including microplastics, nanoplastics, nanomaterials, micropollutants, antimicrobial resistance genes, and emerging pathogens (viruses, bacteria, parasites) of environmental significance.