Liuzhi Zhu, Wenxi Lu, Chengming Luo, Yaning Xu, Zibo Wang
Journal of Contaminant Hydrology, Volume 267, Article 104437. Published 2024-09-24. DOI: 10.1016/j.jconhyd.2024.104437. Impact factor 3.5; JCR Q2 (Environmental Sciences).
https://www.sciencedirect.com/science/article/pii/S0169772224001414
An ensemble optimizer with a stacking ensemble surrogate model for identification of groundwater contamination source
The application of the simulation-optimization method for groundwater contamination source identification (GCSI) encounters two main challenges: the substantial time cost of calling the simulation model, and the limitations on the accuracy of identification results due to the complexity, nonlinearity, and ill-posed nature of the inverse problem. To address these issues, we have innovatively developed an inversion framework based on ensemble learning strategies. This framework comprises a stacking ensemble model (SEM), which integrates three distinct machine learning models (Extremely Randomized Trees, Adaptive Boosting, and Bidirectional Gated Recurrent Unit), and an ensemble optimizer (E-GKSEEFO), which combines two newly proposed swarm intelligence optimizers (Genghis Khan Shark Optimizer and Electric Eel Foraging Optimizer). Specifically, the SEM serves as a surrogate model for the groundwater numerical simulation model. Compared to the original simulation model, it significantly reduces time cost while maintaining accuracy. The E-GKSEEFO, functioning as the search strategy for the optimization model, greatly enhances the accuracy of the optimization results. We have verified the performance of the SEM-E-GKSEEFO ensemble inversion framework through two hypothetical scenarios derived from an actual coal gangue pile. The results are as follows. (1) The SEM exhibits improved fitting performance compared to single machine learning models when dealing with high-dimensional nonlinear data from GCSI. (2) The E-GKSEEFO achieves significantly higher accuracy in the identification results of GCSI than individual optimizers. These findings affirm the effectiveness and superiority of the proposed SEM-E-GKSEEFO ensemble inversion framework.
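The workflow described in the abstract (train a stacked surrogate on a limited number of simulator runs, then let an ensemble of optimizers search the surrogate instead of the simulator) can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the authors' implementation: `simulator` is a made-up one-parameter function; two k-nearest-neighbour regressors stand in for the Extremely Randomized Trees, Adaptive Boosting, and Bidirectional Gated Recurrent Unit base learners; a least-squares blend stands in for the meta-learner; and a random search plus a simple local search stand in for the Genghis Khan Shark Optimizer and the Electric Eel Foraging Optimizer. The abstract does not say how E-GKSEEFO combines its two optimizers, so keeping the better of the two results is purely an assumption.

```python
import random

random.seed(0)

def simulator(s):
    """Hypothetical stand-in for the costly groundwater simulation:
    source strength s -> concentration at one observation well."""
    return 10.0 * s / (1.0 + s)

def knn_fit(xs, ys, k):
    """Return a k-nearest-neighbour regressor (stand-in base learner)."""
    pairs = list(zip(xs, ys))
    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return sum(y for _, y in nearest) / len(nearest)
    return predict

def mse(pred, ys):
    return sum((p - y) ** 2 for p, y in zip(pred, ys)) / len(ys)

# 1) Training data from a limited number of simulator runs.
xs = [random.uniform(0.0, 5.0) for _ in range(60)]
ys = [simulator(x) for x in xs]

# 2) Out-of-fold predictions of each base learner (5 folds).
ks = (1, 5)  # assumed hyperparameters of the two stand-in learners
oof = [[0.0] * len(xs) for _ in ks]
for fold in range(5):
    train = [i for i in range(len(xs)) if i % 5 != fold]
    test = [i for i in range(len(xs)) if i % 5 == fold]
    for m, k in enumerate(ks):
        model = knn_fit([xs[i] for i in train], [ys[i] for i in train], k)
        for i in test:
            oof[m][i] = model(xs[i])

# 3) Meta-learner: least-squares weights on the out-of-fold predictions,
#    obtained by solving the 2x2 normal equations with Cramer's rule.
p1, p2 = oof
a11 = sum(a * a for a in p1)
a12 = sum(a * b for a, b in zip(p1, p2))
a22 = sum(b * b for b in p2)
b1 = sum(a * y for a, y in zip(p1, ys))
b2 = sum(b * y for b, y in zip(p2, ys))
det = a11 * a22 - a12 * a12
w1 = (b1 * a22 - a12 * b2) / det
w2 = (a11 * b2 - a12 * b1) / det

# 4) Refit the base learners on all data; the surrogate blends them.
bases = [knn_fit(xs, ys, k) for k in ks]
def surrogate(s):
    return w1 * bases[0](s) + w2 * bases[1](s)

base_mse = [mse(p, ys) for p in oof]
stacked_mse = mse([w1 * a + w2 * b for a, b in zip(p1, p2)], ys)

# 5) Inverse step: minimise the misfit using the surrogate only.
true_s = 2.0                 # hidden "true" source strength
c_obs = simulator(true_s)    # synthetic observation at the well
def objective(s):
    return (surrogate(s) - c_obs) ** 2

def random_search(n):
    return min((random.uniform(0.0, 5.0) for _ in range(n)), key=objective)

def local_search(start, n, step=0.5):
    best = start
    for _ in range(n):
        cand = min(5.0, max(0.0, best + random.uniform(-step, step)))
        if objective(cand) <= objective(best):
            best = cand
    return best

# Assumed combination rule: run both searches, keep the better result.
candidates = [random_search(200), local_search(2.5, 200)]
best_s = min(candidates, key=objective)
print(f"estimated source strength: {best_s:.2f} (true value {true_s})")
```

The meta-learner is fitted on out-of-fold predictions so that it never sees a base learner's output on a sample that learner was trained on, which is the usual safeguard in stacking. Because the blend weights are unconstrained, the stacked fit on those predictions cannot be worse than either base learner alone.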
About the journal:
The Journal of Contaminant Hydrology is an international journal publishing scientific articles pertaining to the contamination of subsurface water resources. Emphasis is placed on investigations of the physical, chemical, and biological processes influencing the behavior and fate of organic and inorganic contaminants in the unsaturated (vadose) and saturated (groundwater) zones, as well as at groundwater-surface water interfaces. The ecological impacts of contaminants transported both from and to aquifers are of interest. Articles on contamination of surface water only, without a link to groundwater, are out of scope. Broad latitude is allowed in identifying contaminants of interest, which include legacy and emerging pollutants, nutrients, nanoparticles, pathogenic microorganisms (e.g., bacteria, viruses, protozoa), microplastics, and various constituents associated with energy production (e.g., methane, carbon dioxide, hydrogen sulfide).
The journal's scope embraces a wide range of topics including: experimental investigations of contaminant sorption, diffusion, transformation, volatilization and transport in the surface and subsurface; characterization of soil and aquifer properties only as they influence contaminant behavior; development and testing of mathematical models of contaminant behavior; innovative techniques for restoration of contaminated sites; development of new tools or techniques for monitoring the extent of soil and groundwater contamination; transformation of contaminants in the hyporheic zone; effects of contaminants traversing the hyporheic zone on surface water and groundwater ecosystems; subsurface carbon sequestration and/or turnover; and migration of fluids associated with energy production into groundwater.