Investigating machine learning and ensemble learning models in groundwater potential mapping in arid region: case study from Tan-Tan water-scarce region, Morocco
Abdessamad Jari, E. Bachaoui, Soufiane Hajaj, Achraf Khaddari, Younes Khandouch, Abderrazak El Harti, Amine Jellouli, Mustapha Namous
Frontiers in Water (JCR Q2, Water Resources; impact factor 2.6). Published 2023-12-13. DOI: 10.3389/frwa.2023.1305998 (https://doi.org/10.3389/frwa.2023.1305998)
Citations: 0
Abstract
Groundwater resource management in arid regions is critically important for sustaining human activities and ecological systems, and accurate mapping of groundwater potential plays a vital role in effective water resource planning. This study investigates the effectiveness of machine learning models, including Random Forest (RF), Adaboost, K-Nearest Neighbors (KNN), and Gaussian Process, for groundwater potential mapping (GWPM) in the arid Tan-Tan region, Morocco. Fourteen groundwater conditioning factors, spanning topographical, hydrological, climatic, and geological variables, were retained following a multicollinearity test. Point data from 174 sites indicative of groundwater occurrence were also incorporated. The groundwater inventory data were randomly partitioned into training and testing datasets at three ratios: 55/45%, 65/35%, and 75/25%. A comprehensive ranking of the 13 models, encompassing both individual and ensemble models, was then determined using the prioritization rank technique. The results revealed that ensemble learning (EL) models, particularly the combination of RF and Adaboost (RF-Adaboost), outperformed individual models in groundwater potential mapping. Based on accuracy assessment using the validation dataset, the RF-Adaboost EL model yielded an Area Under the Receiver Operating Characteristic Curve (AUROC) of 94.02% and an Overall Accuracy (OA) of 94%. The ensemble models effectively integrated the 14 factors, capturing their intricate interrelationships and thereby enhancing the accuracy and robustness of groundwater prediction in the water-scarce Tan-Tan region. Among the factors, the study identified lithology, structural elements (such as faults and tectonic lineaments), and land use as significant contributors to groundwater potential. However, the study area's coastal position and its limited record of groundwater prospecting (few borehole points) make GWPM challenging. The findings highlight the importance of these significant factors in assessing and managing groundwater resources in arid regions. Moreover, this study contributes to groundwater resource management by demonstrating the effectiveness of ensemble learning algorithms for GWPM in arid regions.
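To make the evaluation setup concrete, the following is a minimal Python sketch of the three random training/testing partitions and of the two reported metrics (AUROC and OA). It is not the authors' code. The function names, the fixed seed, the 0.5 classification threshold, and the toy labels and scores are all illustrative assumptions; only the 174 sites and the 55/65/75% training ratios come from the abstract.

```python
import random


def split_sites(n_sites, train_fraction, seed=0):
    """Randomly partition site indices into training and testing lists.

    The seed is an assumption made only so the sketch is reproducible.
    """
    indices = list(range(n_sites))
    random.Random(seed).shuffle(indices)
    n_train = round(n_sites * train_fraction)
    return indices[:n_train], indices[n_train:]


def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correct ranking.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


def overall_accuracy(labels, scores, threshold=0.5):
    """Share of sites classified correctly at an assumed score threshold."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


# Three partitions of the 174 inventory sites, as described in the abstract.
for fraction in (0.55, 0.65, 0.75):
    train, test = split_sites(174, fraction)
    print(f"{fraction:.0%} training: {len(train)} train / {len(test)} test sites")

# Hypothetical labels (1 = groundwater occurrence) and model scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]
print("AUROC:", auroc(labels, scores))
print("OA:", overall_accuracy(labels, scores))
```

In practice the scores would come from a fitted classifier such as RF or Adaboost evaluated on the held-out testing sites; the pairwise AUROC above is quadratic in the number of sites, which is acceptable for an inventory of this size.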