A new approach for hydrograph data interpolation and outlier removal for vector autoregressive modelling: a case study from the Odra/Oder River
Authors: Michał Halicki, Tomasz Niedzielski
Journal: Stochastic Environmental Research and Risk Assessment (impact factor 3.9; JCR Q1, Engineering, Civil)
Published: 2024-04-12
DOI: https://doi.org/10.1007/s00477-024-02711-5
Citations: 0
Abstract
This study presents a new approach for predicting water levels of the Odra/Oder River using vector autoregressive (VAR) models. We use water level time series from 27 gauging stations, in which we interpolate no-data gaps using the LinAR method and detect outliers with two separate methods: the extreme values (EV) approach and the isolation forest (IFO) algorithm. Before removing potential outliers, we propose a hydrological evaluation based on multivariate data analysis. We then consider three separate data scenarios: LinAR (no outlier rejection), EV, and IFO. VAR models for six prediction gauges were built in a moving-window manner on the most recent 720 hourly water levels prior to each prediction. The analysis covered the period from January 2016 to May 2022 and resulted in approximately 1,000,000 water level forecasts (3 scenarios × 6 gauges × 55,000 hourly time steps) with a lead time of 72 h. The analysis of root mean squared error (RMSE) indicates that the VAR model performs well, especially for 24-hour predictions, with RMSE values ranging from 8 to 28 cm. The model was also found to be skilful in predicting the rising limb of a hydrograph. Our numerical experiments showed that the VAR predictions are susceptible to artefacts in the input data. The IFO method was found to detect outliers skilfully, which made it possible to produce the most accurate VAR-based predictions.
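The moving-window forecasting scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the lag order `p`, the refit interval `step`, the helper names (`fit_var`, `forecast_var`, `rolling_rmse`) and the synthetic two-gauge series are all assumptions made for illustration, and the VAR model is fitted by plain least squares with NumPy. Only the 720-hour window and the 72-hour lead time come from the abstract.

```python
import numpy as np

WINDOW = 720  # hourly water levels used to fit each model (from the abstract)
LEAD = 72     # forecast horizon in hours (from the abstract)


def fit_var(window, p):
    """Least-squares fit of a VAR(p) model with intercept to a (T, k) array."""
    T, _ = window.shape
    # Each regressor row for time t holds [1, y[t-1], ..., y[t-p]].
    lagged = [window[p - i:T - i] for i in range(1, p + 1)]
    X = np.hstack([np.ones((T - p, 1))] + lagged)
    Y = window[p:]
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return coef  # shape (1 + k * p, k)


def forecast_var(coef, history, p, steps):
    """Iterate the fitted model forward `steps` hours from the end of `history`."""
    lags = [history[-i] for i in range(1, p + 1)]  # most recent value first
    out = []
    for _ in range(steps):
        x = np.concatenate([[1.0]] + lags)
        y_next = x @ coef
        out.append(y_next)
        lags = [y_next] + lags[:-1]  # predictions feed later steps
    return np.array(out)


def rolling_rmse(levels, p=2, window=WINDOW, lead=LEAD, step=24):
    """Refit on the latest `window` hours, forecast `lead` hours, collect RMSE."""
    errors = []
    for end in range(window, len(levels) - lead + 1, step):
        coef = fit_var(levels[end - window:end], p)
        pred = forecast_var(coef, levels[:end], p, lead)
        errors.append(pred - levels[end:end + lead])
    errors = np.array(errors)                   # (n_forecasts, lead, k)
    return np.sqrt((errors ** 2).mean(axis=0))  # RMSE per lead hour and gauge


# Synthetic stand-in for two gauges (hypothetical data, in cm).
rng = np.random.default_rng(0)
A = np.array([[0.8, 0.1], [0.0, 0.9]])
c = (np.eye(2) - A) @ np.array([200.0, 200.0])
levels = np.zeros((1200, 2))
levels[0] = 200.0
for t in range(1, len(levels)):
    levels[t] = c + A @ levels[t - 1] + rng.normal(scale=1.0, size=2)

rmse = rolling_rmse(levels, p=1)
print(rmse[[0, 23, 71]])  # RMSE at 1 h, 24 h and 72 h lead times
```

In this sketch each forecast is produced recursively, so predicted values feed the later lead hours; any artefact left inside the 720-hour window enters both the fitted coefficients and the starting lags, which is one way to picture the susceptibility to artefacts that the abstract reports.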
About the journal:
Stochastic Environmental Research and Risk Assessment (SERRA) will publish research papers, reviews and technical notes on stochastic and probabilistic approaches to environmental sciences and engineering, including interactions of earth and atmospheric environments with people and ecosystems. The basic idea is to bring together research papers on stochastic modelling in various fields of environmental sciences and to provide an interdisciplinary forum for the exchange of ideas, for communicating on issues that cut across disciplinary barriers, and for the dissemination of stochastic techniques used in different fields to the community of interested researchers. Original contributions will be considered dealing with modelling (theoretical and computational), measurements and instrumentation in one or more of the following topical areas:
- Spatiotemporal analysis and mapping of natural processes.
- Enviroinformatics.
- Environmental risk assessment, reliability analysis and decision making.
- Surface and subsurface hydrology and hydraulics.
- Multiphase porous media domains and contaminant transport modelling.
- Hazardous waste site characterization.
- Stochastic turbulence and random hydrodynamic fields.
- Chaotic and fractal systems.
- Random waves and seafloor morphology.
- Stochastic atmospheric and climate processes.
- Air pollution and quality assessment research.
- Modern geostatistics.
- Mechanisms of pollutant formation, emission, exposure and absorption.
- Physical, chemical and biological analysis of human exposure from single and multiple media and routes; control and protection.
- Bioinformatics.
- Probabilistic methods in ecology and population biology.
- Epidemiological investigations.
- Models using stochastic ordinary or partial differential equations.