Lovemore Chipindu, Walter Mupangwa, Isaiah Nyagumbo, Mainassara Zaman-Allah
Unsupervised segmentation and clustering time series approach to Southern Africa rainfall regime changes
Geoscience Data Journal, vol. 11, no. 4, pp. 514-530. Published 2023-10-13. Impact factor: 3.3 (JCR Q2, Geosciences, Multidisciplinary).
DOI: 10.1002/gdj3.228
https://onlinelibrary.wiley.com/doi/10.1002/gdj3.228
Citations: 0
Abstract
Analysis of hydro-climatological time series and of the spatiotemporal dynamics of meteorological variables has become critical in the context of climate change, especially in Southern African countries where rain-fed agriculture is predominant. In this work, we compared modern unsupervised time series segmentation and clustering approaches with commonly used time series models to analyse rainfall regime changes in the coastal, sub-humid and semi-arid regions of Southern Africa. Modelling and predicting rainfall regime changes informs farming strategies, especially when choosing measures for mixed crop–livestock farming systems, as farmers can decide to practise rainwater harvesting and moisture conservation, or supplementary irrigation if water resources are available. The main goal of this study was to identify and predict rainfall cluster trends over time using regression with hidden logistic process (RHLP) or hidden Markov model regression (HMMR), supplemented by autoregressive integrated moving average (ARIMA) and Facebook Prophet models. Historical rainfall time series data were sourced from meteorological services departments for selected sites over an average period of 55 years. The commonly used approaches forecasted an upward rainfall trend in the coastal and sub-humid regions and a declining trend in semi-arid areas, with high variability between and within seasons. For all sites, Ljung-Box test statistics suggested the existence of autocorrelation in the rainfall time series data. Prediction capabilities were investigated using the root mean square error (RMSE), mean absolute error (MAE) and mean absolute percentage error (MAPE), which indicated little difference between the ARIMA and Facebook Prophet models. RHLP and HMMR offered a unique clustering and segmentation approach for examining between- and within-season rainfall variability. A maximum of 20 unique rainfall clusters with similar trend characteristics was determined, as going beyond this number made no significant difference to the detected regime changes. A clear trend was exhibited in the years before 1980 compared with recent years, signifying how unpredictable rainfall in Southern Africa is. The unsupervised approaches predicted a clearer cluster trend in the coastal region than in the sub-humid and semi-arid regions. Their performance was assessed using the Akaike information criterion (AIC) and log-likelihood, which showed improvement in prediction power as the number of segmentation clusters approached 20.
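The evaluation measures named in the abstract (RMSE, MAE, MAPE, the Ljung-Box statistic and the Akaike information criterion) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the function names and the sample rainfall values are hypothetical, MAPE here skips zero-rainfall observations to avoid division by zero (the paper's handling of zeros is not stated), and the Akaike criterion is taken in its standard form, AIC = 2k − 2 ln L.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between observed and forecast values."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def mae(actual, predicted):
    """Mean absolute error between observed and forecast values."""
    n = len(actual)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / n


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumption: observations equal to zero (dry periods) are skipped,
    because the percentage error is undefined for them.
    """
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    return 100.0 * sum(abs((a - p) / a) for a, p in pairs) / len(pairs)


def ljung_box_q(series, max_lag):
    """Ljung-Box Q statistic: n(n+2) * sum_{k=1..h} rho_k^2 / (n - k).

    A large Q relative to a chi-square distribution with max_lag degrees
    of freedom points to autocorrelation in the series.
    """
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    total = 0.0
    for k in range(1, max_lag + 1):
        # Sample autocorrelation at lag k
        rho_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        total += rho_k ** 2 / (n - k)
    return n * (n + 2) * total


def aic(log_likelihood, n_params):
    """Akaike information criterion; lower values indicate a better trade-off."""
    return 2 * n_params - 2 * log_likelihood


if __name__ == "__main__":
    # Hypothetical seasonal rainfall totals (mm), for illustration only
    observed = [100.0, 200.0, 50.0, 0.0]
    forecast = [110.0, 180.0, 60.0, 5.0]
    print("RMSE:", rmse(observed, forecast))
    print("MAE:", mae(observed, forecast))
    print("MAPE (%):", mape(observed, forecast))
    print("Ljung-Box Q (lag 1):", ljung_box_q(observed, 1))
    print("AIC:", aic(log_likelihood=-120.0, n_params=5))
```

When comparing segmentation models with different numbers of clusters, the same `aic` helper could be evaluated on each fitted model's log-likelihood and parameter count, with lower values favoured.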
Geoscience Data Journal
Categories: GEOSCIENCES, MULTIDISCIPLINARY; METEOROLOGY & ATMOSPHERIC SCIENCES
CiteScore: 5.90
Self-citation rate: 9.40%
Articles published: 35
Review time: 4 weeks
About the journal:
Geoscience Data Journal provides an Open Access platform where scientific data can be formally published, in a way that includes scientific peer-review. Thus the dataset creator attains full credit for their efforts, while also improving the scientific record, providing version control for the community and allowing major datasets to be fully described, cited and discovered.
An online-only journal, GDJ publishes short data papers cross-linked to – and citing – datasets that have been deposited in approved data centres and awarded DOIs. The journal will also accept articles on data services, and articles which support and inform data publishing best practices.
Data is at the heart of science and scientific endeavour. The curation of data and the science associated with it is as important as ever to our understanding of the changing Earth system and to our ability to make future predictions. Geoscience Data Journal is working with recognised Data Centres across the globe to develop the future strategy for data publication, the recognition of the value of data and the communication and exploitation of data to the wider science and stakeholder communities.