Gamal AbdElNasser Allam Abouzied, Guoqiang Tang, Simon Michael Papalexiou, Martyn P. Clark, Eleonora Aruffo, Piero Di Carlo
Geoscience Data Journal 12(1). Published 2024-09-24. DOI: 10.1002/gdj3.267
https://onlinelibrary.wiley.com/doi/10.1002/gdj3.267
Completion of the Central Italy daily precipitation instrumental data series from 1951 to 2019
Precipitation is a critical part of the global hydrological cycle and determines the distribution of water resources. It is also an essential meteorological variable used as input for hydroclimatic models and projections. However, precipitation series are frequently incomplete, especially the daily and sub-daily station records, which are usually large and complex. To address this, gap filling is commonly used to produce complete hydrometeorological data series without missing values, and several gap-filling methods have been developed and improved. This study fills the gaps in 201 daily precipitation time series in Central Italy by localizing the approach used to generate the Serially Complete dataset for the Planet Earth (SC-Earth). The method combines the outcomes of 15 strategies based on four gap-filling techniques (quantile mapping, spatial interpolation, machine learning, and multi-strategy merging). These strategies use the daily data of neighbouring stations and the matched ERA5 data to estimate missing values at the target stations. Both the raw data and the final serially complete station datasets (SCDs) underwent comprehensive quality control. Several accuracy indicators were used to evaluate the strategies' estimates and the final SCD, including the correlation coefficient (CC), root mean square error (RMSE), relative bias (Bias %), and Kling-Gupta efficiency (KGE″). The multi-strategy merging strategy based on the modified Kling-Gupta efficiency (MS1) performs best as an individual precipitation gap-filling strategy. However, the machine learning strategy using random forest (ML3) contributes the largest share of the final estimates among all strategies. Overall, the temporal–spatial performance of the final SCD is promising and depends on the pattern of the missing values (MV%). The mean values of KGE″, CC, variability (α), and bias term (β) are 0.9, 0.93, 1.064, and 4.98 × 10⁻⁷, respectively.
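The accuracy indicators named in the abstract can be illustrated with a short sketch. This is not the authors' code. It assumes the common definitions of CC, RMSE, and relative bias, and it assumes that KGE″ takes the form 1 − sqrt((CC − 1)² + (α − 1)² + β²), with α the ratio of estimated to observed standard deviation and β the mean difference scaled by the observed standard deviation. That form is consistent with the near-zero mean β reported above, but the abstract does not state it. The function name `accuracy_metrics` and the sample values are hypothetical.

```python
import math


def accuracy_metrics(obs, est):
    """Return CC, RMSE, relative bias (%), alpha, beta, and KGE'' for two series.

    obs: observed daily precipitation; est: gap-filled estimates on the same days.
    Assumes obs has non-zero variance and a non-zero sum (otherwise divisions fail).
    """
    n = len(obs)
    if n != len(est) or n < 2:
        raise ValueError("obs and est must have the same length (at least 2)")

    mean_o = sum(obs) / n
    mean_e = sum(est) / n
    # Population standard deviations (an assumption; the source does not specify).
    sd_o = math.sqrt(sum((o - mean_o) ** 2 for o in obs) / n)
    sd_e = math.sqrt(sum((e - mean_e) ** 2 for e in est) / n)
    cov = sum((o - mean_o) * (e - mean_e) for o, e in zip(obs, est)) / n

    cc = cov / (sd_o * sd_e)                       # Pearson correlation coefficient
    rmse = math.sqrt(sum((e - o) ** 2 for o, e in zip(obs, est)) / n)
    rel_bias = 100.0 * (sum(est) - sum(obs)) / sum(obs)
    alpha = sd_e / sd_o                            # variability term
    beta = (mean_e - mean_o) / sd_o                # bias term (assumed KGE'' form)
    kge = 1.0 - math.sqrt((cc - 1.0) ** 2 + (alpha - 1.0) ** 2 + beta ** 2)

    return {"CC": cc, "RMSE": rmse, "Bias%": rel_bias,
            "alpha": alpha, "beta": beta, "KGE''": kge}


if __name__ == "__main__":
    observed = [0.0, 2.0, 5.0, 1.0]     # hypothetical values in mm/day
    estimated = [0.2, 1.8, 4.5, 1.1]    # hypothetical gap-filled values
    for name, value in accuracy_metrics(observed, estimated).items():
        print(f"{name}: {value:.3f}")
```

Under these assumed definitions a perfect estimate gives CC = 1, α = 1, β = 0, and KGE″ = 1, so the reported mean α of 1.064 would indicate slightly overestimated variability.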
Geoscience Data Journal
Categories: GEOSCIENCES, MULTIDISCIPLINARY; METEOROLOGY - METEOROLOGY & ATMOSPHERIC SCIENCES
CiteScore: 5.90
Self-citation rate: 9.40%
Articles published: 35
Review time: 4 weeks
About the journal:
Geoscience Data Journal provides an Open Access platform where scientific data can be formally published, in a way that includes scientific peer-review. Thus the dataset creator attains full credit for their efforts, while also improving the scientific record, providing version control for the community and allowing major datasets to be fully described, cited and discovered.
An online-only journal, GDJ publishes short data papers cross-linked to – and citing – datasets that have been deposited in approved data centres and awarded DOIs. The journal will also accept articles on data services, and articles which support and inform data publishing best practices.
Data is at the heart of science and scientific endeavour. The curation of data, and the science associated with it, is as important as ever to our understanding of the changing earth system and to our ability to make future predictions. Geoscience Data Journal is working with recognised Data Centres across the globe to develop the future strategy for data publication, the recognition of the value of data and the communication and exploitation of data to the wider science and stakeholder communities.