Title: A testing approach to clustering scalar time series
Authors: Daniel Peña, Ruey S. Tsay
Journal: Journal of Time Series Analysis, volume 44, issue 5-6, pages 667-685
Published: 2023-06-17
DOI: 10.1111/jtsa.12706
URL: https://onlinelibrary.wiley.com/doi/10.1111/jtsa.12706
Citations: 1
Abstract
This article considers clustering stationary scalar time series using their marginal properties and a hierarchical method. Two major issues involved are to detect the existence of clusters and to determine their number. We propose a new test statistic for detecting whether a data set consists of multiple clusters and a new procedure to determine the number of clusters. The proposed method is based on the jumps, that is, the increments, in the heights of the dendrogram when a hierarchical clustering is applied to the data. We use autoregressive sieve bootstrap to obtain a reference distribution of the test statistics and propose an iterative procedure to find the number of clusters. The clusters found are internally homogeneous according to the test statistics used in the analysis. The performance of the proposed procedure in finite samples is investigated by Monte Carlo simulations and illustrated by some empirical examples. Comparisons with some existing methods for selecting the number of clusters are also investigated.
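As a rough illustration of the dendrogram-jump idea described in the abstract, the following Python sketch clusters one scalar feature per series, records the merge heights, and looks at their increments. It is not the authors' test statistic or procedure. The function names, the single-linkage choice, the toy feature values, and the "cut at the largest jump" rule are all assumptions made for illustration, and the autoregressive sieve bootstrap that the paper uses to obtain a reference distribution for the jumps is omitted.

```python
from itertools import combinations


def merge_heights(features):
    """Single-linkage agglomerative clustering of scalar features.

    Returns the dendrogram merge heights in the order the merges occur.
    (Hypothetical helper; linkage choice is an assumption, not from the paper.)
    """
    clusters = [[x] for x in features]
    heights = []
    while len(clusters) > 1:
        best = None
        # Find the pair of clusters with the smallest single-linkage distance.
        for i, j in combinations(range(len(clusters)), 2):
            d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
            if best is None or d < best[0]:
                best = (d, i, j)
        d, i, j = best
        heights.append(d)
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return heights


def height_jumps(heights):
    """Increments between consecutive merge heights."""
    return [b - a for a, b in zip(heights, heights[1:])]


def clusters_at_largest_jump(features):
    """Number of clusters left when cutting just before the largest jump.

    Illustrative rule only: the paper calibrates jumps with a bootstrap
    reference distribution rather than simply taking the maximum.
    """
    jumps = height_jumps(merge_heights(features))
    k = max(range(len(jumps)), key=lambda idx: jumps[idx])
    # Jump k separates merge k+1 from merge k+2 (1-based), so k+1 merges
    # have been made and len(features) - (k + 1) clusters remain.
    return len(features) - (k + 1)


# Toy data: one hypothetical marginal summary value per time series.
toy_features = [1, 2, 4, 20, 21, 23]
heights = merge_heights(toy_features)
jumps = height_jumps(heights)
n_clusters = clusters_at_largest_jump(toy_features)
print(heights, jumps, n_clusters)
```

In this toy example the last merge height stands far above the others, so the largest jump suggests two groups. In practice, whether a jump is large enough to indicate separate clusters is exactly the question the paper's bootstrap-based test is designed to answer.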
About the journal:
During the last 30 years Time Series Analysis has become one of the most important and widely used branches of Mathematical Statistics. Its fields of application range from neurophysiology to astrophysics and it covers such well-known areas as economic forecasting, study of biological data, control systems, signal processing and communications and vibrations engineering.
The Journal of Time Series Analysis started in 1980 and has since become the leading journal in its field, publishing papers on both fundamental theory and applications, as well as review papers dealing with recent advances in major areas of the subject and short communications on theoretical developments. The editorial board consists of many of the world's leading experts in Time Series Analysis.