Comparative assessment of classical and machine learning approaches for rainfall data restoration
Authors: Alireza Borhani Dariane, Matineh Imani Borhan
Journal: Environmental Earth Sciences, 84 (10), published 2025-05-13
Impact factor: 2.8 (JCR Q3, Environmental Sciences)
DOI: 10.1007/s12665-025-12255-8
URL: https://link.springer.com/article/10.1007/s12665-025-12255-8
Comprehensive long-term hydrological data are essential for water resource management studies because they improve the precision of hydrological models. This article investigates and compares classical and machine learning (ML) methods for recovering missing rainfall data, with the goal of identifying methods that accurately recover missing rainfall data in the studied region. The study focuses on five mountainous basins in the Central Alborz Ranges in Iran and uses 30 years of data. The classical methods include arithmetic average (AA), linear regression (LR), multiple linear regression (MLR), inverse distance weighting (IDW), kriging with three different semi-variogram models, normal ratio (NR), and a proposed linear regression-arithmetic average (LR-AA) method. The machine learning methods include artificial neural networks (ANN), support vector regression (SVR), M5 trees, and, as a novel approach, two types of adaptive neuro-fuzzy inference systems (ANFIS). To reduce the influence of the chosen recovery period on the results, artificial gaps were introduced over three intervals: 1990–1993, 2002–2005, and 2011–2014. In addition, a Social Choice method was coupled with the evaluation criteria to enhance the comparison. In general, the results indicate that machine learning methods outperform the classical approaches. For example, for the 2002–2005 gap in the Karaj basin, SVR is the most effective method, with RMSE, NSE, and \(R^{2}\) values of 7.31 mm, 0.97, and 0.97, respectively. The proposed LR-AA method performed better than AA or LR alone, as well as most other classical methods. All methods were evaluated and compared using various criteria, making the results a useful reference for hydrological studies involving rainfall data recovery.
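The abstract names several classical estimators and evaluation criteria without stating their formulas. The following is a minimal sketch, assuming the common textbook definitions of AA, NR, IDW, RMSE, and NSE; it is not taken from the paper. The function names, the IDW exponent of 2, and all station values are hypothetical and chosen only for illustration.

```python
import math


def arithmetic_average(neighbors):
    """AA: plain mean of the neighbouring stations' rainfall for one time step."""
    return sum(neighbors) / len(neighbors)


def normal_ratio(neighbors, neighbor_normals, target_normal):
    """NR: scale each neighbour by the ratio of long-term normals, then average."""
    scaled = [
        target_normal / normal * value
        for value, normal in zip(neighbors, neighbor_normals)
    ]
    return sum(scaled) / len(scaled)


def inverse_distance_weighting(neighbors, distances, power=2.0):
    """IDW: weight each neighbour by 1 / distance**power (power=2 is an assumption)."""
    weights = [1.0 / d ** power for d in distances]
    weighted_sum = sum(w * value for w, value in zip(weights, neighbors))
    return weighted_sum / sum(weights)


def rmse(observed, estimated):
    """Root mean square error, in the same units as the data (e.g. mm)."""
    squared = [(o - e) ** 2 for o, e in zip(observed, estimated)]
    return math.sqrt(sum(squared) / len(squared))


def nse(observed, estimated):
    """Nash-Sutcliffe efficiency: 1 minus squared error over variance about the observed mean."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - e) ** 2 for o, e in zip(observed, estimated))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / sst


if __name__ == "__main__":
    # Hypothetical rainfall (mm) at three neighbouring stations for one time step.
    neighbors = [10.0, 20.0, 30.0]
    distances_km = [1.0, 2.0, 2.0]           # hypothetical distances to the target
    neighbor_normals = [400.0, 500.0, 600.0]  # hypothetical long-term annual normals
    target_normal = 500.0

    print("AA :", arithmetic_average(neighbors))
    print("NR :", normal_ratio(neighbors, neighbor_normals, target_normal))
    print("IDW:", inverse_distance_weighting(neighbors, distances_km))
```

In an artificial-gap experiment of the kind described above, estimates produced this way for the withheld period would be compared against the withheld observations using `rmse` and `nse`; an NSE of 1 indicates a perfect match, and 0 indicates no improvement over the observed mean.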
About the journal:
Environmental Earth Sciences is an international multidisciplinary journal concerned with all aspects of interaction between humans, natural resources, ecosystems, special climates or unique geographic zones, and the earth:
Water and soil contamination caused by waste management and disposal practices
Environmental problems associated with transportation by land, air, or water
Geological processes that may impact biosystems or humans
Man-made or naturally occurring geological or hydrological hazards
Environmental problems associated with the recovery of materials from the earth
Environmental problems caused by extraction of minerals, coal, and ores, as well as oil and gas, water and alternative energy sources
Environmental impacts of exploration and recultivation
Environmental impacts of hazardous materials
Management of environmental data and information in data banks and information systems
Dissemination of knowledge on techniques, methods, approaches and experiences to improve and remediate the environment
In pursuit of these topics, the geoscientific disciplines are invited to contribute their knowledge and experience. Major disciplines include: hydrogeology, hydrochemistry, geochemistry, geophysics, engineering geology, remediation science, natural resources management, environmental climatology and biota, environmental geography, soil science and geomicrobiology.