Title: A hybrid Deep Forest surrogate model and Rime optimization algorithm (RIME) framework for groundwater contamination source identification (GCSI)
Authors: Yuanbo Ge, Jun Dong, Weihong Zhang
Journal: Advances in Water Resources, volume 205, Article 105098 (JCR Q1, Water Resources; impact factor 4.2)
Published: 2025-08-28
DOI: 10.1016/j.advwatres.2025.105098
URL: https://www.sciencedirect.com/science/article/pii/S030917082500212X
Citation count: 0
Abstract
When groundwater pollution occurs at operating enterprises, it is often difficult to determine the pollution source in a timely manner. In addition, the complex migration and transformation of pollutants in groundwater, coupled with potentially sparse and noisy monitoring data, make the monitoring data highly nonlinear, which reduces the accuracy of source identification. To improve the efficiency and accuracy of groundwater contamination source identification (GCSI), we propose an integrated framework that combines a Deep Forest surrogate model with the Rime optimization algorithm (RIME), creating a high-accuracy, low-cost simulation-optimization system for precise GCSI at operating enterprises. We establish a high-precision groundwater numerical simulation model using practical case data and use it to generate the dataset required for GCSI. The objective of this work is to establish this framework and to evaluate its efficiency and accuracy by comparison with Backpropagation Neural Networks (BPNN), Bidirectional Long Short-Term Memory networks (BiLSTM), and the Genetic Algorithm (GA). The results indicate that the accuracy of Deep Forest is slightly higher than that of BiLSTM, with an RMSE of 70.4469, an MAE of 37.1714, and an R² of 0.9793. BPNN requires the least time, 17.4103 s, but has the worst accuracy. Notably, Deep Forest reduces computation time by 9.8 % compared with BiLSTM. When the contamination source results identified by Deep Forest are input into the groundwater numerical model, the relative error between simulated and observed contaminant concentrations is ≤15 %.
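The accuracy figures quoted in the abstract (RMSE, MAE, R², and the relative error between simulated and observed concentrations) follow standard definitions. The sketch below shows how such metrics are commonly computed. It is illustrative only: the function names and the sample concentration values are assumptions, not the authors' code or data.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def relative_error(observed, simulated):
    """Pointwise relative error |simulated - observed| / |observed|."""
    return [abs(s - o) / abs(o) for o, s in zip(observed, simulated)]

# Hypothetical concentrations (arbitrary units), purely for illustration.
observed = [100.0, 200.0, 300.0, 400.0]
simulated = [110.0, 190.0, 310.0, 390.0]

print("RMSE:", rmse(observed, simulated))
print("MAE:", mae(observed, simulated))
print("R2:", r2(observed, simulated))
print("max relative error:", max(relative_error(observed, simulated)))
```

A check such as `max(relative_error(observed, simulated)) <= 0.15` corresponds to the ≤15 % criterion reported for the identified sources, assuming that criterion is applied pointwise at each observation.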
About the journal:
Advances in Water Resources provides a forum for the presentation of fundamental scientific advances in the understanding of water resources systems. The scope of Advances in Water Resources includes any combination of theoretical, computational, and experimental approaches used to advance fundamental understanding of surface or subsurface water resources systems or the interaction of these systems with the atmosphere, geosphere, biosphere, and human societies. Manuscripts involving case studies that do not attempt to reach broader conclusions, research on engineering design, applied hydraulics, or water quality and treatment, as well as applications of existing knowledge that do not advance fundamental understanding of hydrological processes, are not appropriate for Advances in Water Resources.
Examples of appropriate topical areas that will be considered include the following:
• Surface and subsurface hydrology
• Hydrometeorology
• Environmental fluid dynamics
• Ecohydrology and ecohydrodynamics
• Multiphase transport phenomena in porous media
• Fluid flow and species transport and reaction processes