Title: Synthetic random environmental time series generation with similarity control, preserving original signal's statistical characteristics
Authors: Ofek Aloni, Gal Perelman, Barak Fishbain
Journal: Environmental Modelling & Software (impact factor 4.8; JCR Q1, Computer Science, Interdisciplinary Applications)
Publication date: 2024-11-30
DOI: 10.1016/j.envsoft.2024.106283 (https://doi.org/10.1016/j.envsoft.2024.106283)
Citation count: 0
Abstract
Synthetic datasets are widely used in applications such as missing-data imputation, simulation, training of data-driven models, and system robustness analysis. Typically derived from historical data, these datasets must represent specific system behaviors while remaining diverse enough to challenge the system with a broad range of inputs. This paper introduces a method that uses the discrete Fourier transform to generate synthetic time series whose statistical moments are similar to those of any given signal. The method allows control over the level of similarity between the original and synthetic signals. An analytical proof shows that the method preserves the first two statistical moments and the autocorrelation function of the input signal. The method is compared with ARMA, GAN, and CoSMoS methods on environmental datasets with different temporal resolutions and domains, demonstrating its generality and flexibility. A Python library implementing the method is available as open-source software.
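As a rough illustration of the kind of Fourier-based generation the abstract describes, the sketch below perturbs the phases of a signal's discrete Fourier transform while leaving the amplitudes untouched. It is not the authors' library or their exact algorithm: the function names `phase_perturbed_surrogate` and `circular_autocorr`, and the `alpha` parameter used here as a similarity knob, are assumptions made for this example. Keeping the amplitude spectrum fixed preserves the mean (the zero-frequency bin), the variance (by Parseval's theorem), and the circular autocorrelation (by the Wiener-Khinchin relation).

```python
import numpy as np


def phase_perturbed_surrogate(x, alpha=1.0, rng=None):
    """Return a surrogate of x with the same DFT amplitudes.

    alpha is a hypothetical similarity knob for this sketch:
    alpha=0 reproduces x, alpha=1 draws fully random phase offsets.
    """
    rng = np.random.default_rng(rng)
    x = np.asarray(x, dtype=float)
    n = x.size
    spec = np.fft.rfft(x)

    # Random phase offsets in [-pi, pi), scaled by alpha.
    shift = alpha * rng.uniform(-np.pi, np.pi, size=spec.size)
    shift[0] = 0.0  # leave the zero-frequency bin alone so the mean is kept
    if n % 2 == 0:
        shift[-1] = 0.0  # the Nyquist bin must stay real for even n

    # Multiplying by exp(1j * shift) changes phases but not amplitudes.
    return np.fft.irfft(spec * np.exp(1j * shift), n=n)


def circular_autocorr(x):
    """Circular autocorrelation of x, normalised to 1 at lag 0."""
    x = np.asarray(x, dtype=float)
    power = np.abs(np.fft.rfft(x - x.mean())) ** 2
    acf = np.fft.irfft(power, n=x.size)
    return acf / acf[0]
```

Calling `phase_perturbed_surrogate(x, alpha=0.3, rng=42)` on a one-dimensional array `x` would give a series that stays closer to the original than `alpha=1.0` does, while the mean, variance, and circular autocorrelation match in both cases up to floating-point error.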
About the journal:
Environmental Modelling & Software publishes contributions, in the form of research articles, reviews and short communications, on recent advances in environmental modelling and/or software. The aim is to improve our capacity to represent, understand, predict or manage the behaviour of environmental systems at all practical scales, and to communicate those improvements to a wide scientific and professional audience.