Richard A. Davis, Leon Fernandes, Konstantinos Fokianos
Clustering multivariate time series using energy distance
Journal of Time Series Analysis, vol. 44, no. 5-6, pp. 487-504. Published 2023-03-27. DOI: 10.1111/jtsa.12688
https://onlinelibrary.wiley.com/doi/10.1111/jtsa.12688
Abstract
A novel methodology is proposed for clustering multivariate time series data using the energy distance defined in Székely and Rizzo (2013). Specifically, a dissimilarity matrix is formed using the energy distance statistic to measure the separation between the finite-dimensional distributions of the component time series. Once the pairwise dissimilarity matrix is calculated, a hierarchical clustering method is applied to obtain the dendrogram. This procedure is completely nonparametric, as the dissimilarities between stationary distributions are calculated directly without making any model assumptions. To justify this procedure, asymptotic properties of the energy distance estimates are derived for general stationary and ergodic time series. The method is illustrated in a simulation study for various component time series that are either linear or nonlinear. Finally, the methodology is applied to two examples: one involves the GDP of selected countries and the other the population sizes of various U.S. states in the years 1900–1999.
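The pipeline described in the abstract (pairwise energy distances, then hierarchical clustering) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function names, the lag dimension `p`, the V-statistic form of the estimator, the average-linkage choice, and the simulated Gaussian series are all assumptions introduced here. For samples X and Y with independent copies X' and Y', the energy distance is 2E|X - Y| - E|X - X'| - E|Y - Y'|, and the code estimates each expectation with a sample mean over lagged vectors.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import cdist, squareform


def lag_embed(x, p):
    """Stack p consecutive observations into rows (x_t, ..., x_{t+p-1}).

    The rows act as a sample from the p-dimensional (finite-dimensional)
    distribution of the series. The value of p is an assumed tuning choice.
    """
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]  # treat a univariate series as an n x 1 array
    n = x.shape[0]
    return np.hstack([x[k:n - p + 1 + k] for k in range(p)])


def energy_distance(X, Y):
    """Empirical (V-statistic) energy distance between two samples of rows."""
    exy = cdist(X, Y).mean()  # estimates E|X - Y|
    exx = cdist(X, X).mean()  # estimates E|X - X'|
    eyy = cdist(Y, Y).mean()  # estimates E|Y - Y'|
    # Clip at zero to guard against tiny negative rounding errors.
    return max(2.0 * exy - exx - eyy, 0.0)


def energy_dissimilarity_matrix(series, p=2):
    """Symmetric matrix of pairwise energy distances between the series."""
    embedded = [lag_embed(s, p) for s in series]
    m = len(embedded)
    D = np.zeros((m, m))
    for i in range(m):
        for j in range(i + 1, m):
            D[i, j] = D[j, i] = energy_distance(embedded[i], embedded[j])
    return D


def cluster_series(series, n_clusters, p=2, method="average"):
    """Hierarchical clustering on the energy dissimilarity matrix."""
    D = energy_dissimilarity_matrix(series, p)
    Z = linkage(squareform(D), method=method)  # linkage matrix (dendrogram)
    labels = fcluster(Z, t=n_clusters, criterion="maxclust")
    return labels, D


# Hypothetical toy data: three low-variance and three high-variance series.
rng = np.random.default_rng(0)
series = [rng.normal(scale=1.0, size=300) for _ in range(3)]
series += [rng.normal(scale=3.0, size=300) for _ in range(3)]
labels, D = cluster_series(series, n_clusters=2)
print(labels)
```

The linkage matrix `Z` computed inside `cluster_series` is what `scipy.cluster.hierarchy.dendrogram` would take to draw the dendrogram mentioned in the abstract. Because no model is fitted at any step, the sketch stays nonparametric in the same sense as the procedure described above.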
About the journal:
During the last 30 years Time Series Analysis has become one of the most important and widely used branches of Mathematical Statistics. Its fields of application range from neurophysiology to astrophysics and it covers such well-known areas as economic forecasting, study of biological data, control systems, signal processing and communications and vibrations engineering.
The Journal of Time Series Analysis, started in 1980, has since become the leading journal in its field, publishing papers on both fundamental theory and applications, as well as review papers dealing with recent advances in major areas of the subject and short communications on theoretical developments. The editorial board consists of many of the world's leading experts in Time Series Analysis.