Massive feature extraction for explaining and foretelling hydroclimatic time series forecastability at the global scale

IF 8.5 | Zone 1 (Earth Sciences) | JCR Q1 (GEOSCIENCES, MULTIDISCIPLINARY)
Georgia Papacharalampous, Hristos Tyralis, Ilias G. Pechlivanidis, Salvatore Grimaldi, Elena Volpi
Geoscience Frontiers, 13(3), Article 101349. Published 2022-05-01. DOI: 10.1016/j.gsf.2022.101349
https://www.sciencedirect.com/science/article/pii/S1674987122000020
Citations: 8

Abstract

Statistical analyses and descriptive characterizations are sometimes assumed to offer information on time series forecastability. Despite the scientific interest suggested by such assumptions, the relationships between descriptive time series features (e.g., temporal dependence, entropy, seasonality, trend and linearity features) and actual time series forecastability (quantified by issuing and assessing forecasts for the past) have rarely been studied and quantified in the literature. In this work, we aim to fill this gap by investigating such relationships and the ways in which they can be exploited for understanding hydroclimatic forecastability and its patterns. To this end, we follow a systematic framework that brings together a variety of concepts and methods (mostly new to hydrology), including 57 descriptive features and nine seasonal time series forecasting methods (i.e., one simple, five exponential smoothing, two state space and one automated autoregressive fractionally integrated moving average methods). We apply this framework to three global datasets originating from the larger Global Historical Climatology Network (GHCN) and Global Streamflow Indices and Metadata (GSIM) archives. As these datasets comprise over 13,000 monthly temperature, precipitation and river flow time series from several continents and hydroclimatic regimes, they allow us to provide reliable characterizations and interpretations of 12-month-ahead hydroclimatic forecastability at the global scale. We first find that the exponential smoothing and state space methods for time series forecasting are roughly equally efficient in identifying an upper limit of this forecastability in terms of Nash-Sutcliffe efficiency, while the simple method is shown to be mostly useful in identifying its lower limit. We then demonstrate that the assessed forecastability is strongly related to several descriptive features, including seasonality, entropy, (partial) autocorrelation, stability, (non)linearity, spikiness and heterogeneity features, among others. We further (i) show that, if such descriptive information is available for a monthly hydroclimatic time series, we can even foretell the quality of its future forecasts with a considerable degree of confidence, and (ii) rank the features according to their efficiency in explaining and foretelling forecastability. We believe that the obtained rankings are of key importance for understanding forecastability. Spatial forecastability patterns are also revealed through our experiments. Compared to other continental-scale regions, East Asia (Europe) is characterized by larger (smaller) monthly temperature forecastability and the Indian subcontinent (Australia) by larger (smaller) monthly precipitation forecastability, while monthly river flow forecastability differs less notably from continent to continent. A comprehensive interpretation of such patterns through massive feature extraction and feature-based time series clustering is shown to be possible. Indeed, continental-scale regions characterized by different degrees of forecastability are also attributed to different clusters or mixtures of clusters (because of their essential differences in terms of descriptive features).
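To make the evaluation step in the abstract concrete, the following is a minimal sketch of how 12-month-ahead forecastability could be scored with the Nash-Sutcliffe efficiency. It is not the authors' code. The abstract does not say which "simple" method was used, so a seasonal naive forecast (repeating the last observed year) is assumed here as a stand-in. The function names `nash_sutcliffe` and `seasonal_naive` and the synthetic series are hypothetical and serve only as illustration.

```python
import numpy as np


def nash_sutcliffe(observed, forecast):
    """Nash-Sutcliffe efficiency: 1 - SSE / squared spread of the observations.

    A value of 1 is a perfect forecast; 0 matches forecasting the observed mean.
    """
    observed = np.asarray(observed, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    sse = np.sum((observed - forecast) ** 2)
    spread = np.sum((observed - observed.mean()) ** 2)
    return 1.0 - sse / spread


def seasonal_naive(train, horizon=12, period=12):
    """Assumed stand-in for a 'simple' method: repeat the last seasonal cycle."""
    train = np.asarray(train, dtype=float)
    last_cycle = train[-period:]
    repeats = -(-horizon // period)  # ceiling division
    return np.tile(last_cycle, repeats)[:horizon]


# Synthetic monthly series (illustrative only, not GHCN or GSIM data):
# a strong annual cycle plus Gaussian noise.
rng = np.random.default_rng(0)
months = np.arange(240)
series = 10 + 8 * np.sin(2 * np.pi * months / 12) + rng.normal(0, 1, size=240)

# Hold out the last 12 months, forecast them, and score the forecast.
train, test = series[:-12], series[-12:]
forecast = seasonal_naive(train, horizon=12)
nse = nash_sutcliffe(test, forecast)
print(f"12-month-ahead NSE of the seasonal naive forecast: {nse:.3f}")
```

Under this reading, repeating the score for every series and every forecasting method would give the per-series forecastability values that the descriptive features are then related to.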

Source journal: Geoscience Frontiers (Earth and Planetary Sciences: General Earth and Planetary Sciences)
CiteScore: 17.80
Self-citation rate: 3.40%
Articles published: 147
Review time: 35 days
About the journal: Geoscience Frontiers (GSF) is the Journal of China University of Geosciences (Beijing) and Peking University. It publishes peer-reviewed research articles and reviews in interdisciplinary fields of Earth and Planetary Sciences. GSF covers various research areas including petrology and geochemistry, lithospheric architecture and mantle dynamics, global tectonics, economic geology and fuel exploration, geophysics, stratigraphy and paleontology, environmental and engineering geology, astrogeology, and the nexus of resources-energy-emissions-climate under Sustainable Development Goals. The journal aims to bridge innovative, provocative, and challenging concepts and models in these fields, providing insights on correlations and evolution.