Clustering multivariate time series using energy distance

IF 1.2 | Zone 4, Mathematics | JCR Q3, MATHEMATICS, INTERDISCIPLINARY APPLICATIONS
Richard A. Davis, Leon Fernandes, Konstantinos Fokianos
Journal of Time Series Analysis. Published 2023-03-27. DOI: 10.1111/jtsa.12688
https://onlinelibrary.wiley.com/doi/10.1111/jtsa.12688
Citations: 0

Abstract


A novel methodology is proposed for clustering multivariate time series data using energy distance defined in Székely and Rizzo (2013). Specifically, a dissimilarity matrix is formed using the energy distance statistic to measure the separation between the finite-dimensional distributions for the component time series. Once the pairwise dissimilarity matrix is calculated, a hierarchical clustering method is then applied to obtain the dendrogram. This procedure is completely nonparametric as the dissimilarities between stationary distributions are directly calculated without making any model assumptions. In order to justify this procedure, asymptotic properties of the energy distance estimates are derived for general stationary and ergodic time series. The method is illustrated in a simulation study for various component time series that are either linear or nonlinear. Finally, the methodology is applied to two examples; one involves the GDP of selected countries and the other is the population size of various states in the U.S.A. in the years 1900–1999.
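The pipeline described in the abstract (pairwise energy distances, then hierarchical clustering) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. It assumes univariate component series, represents each finite-dimensional distribution by vectors of m consecutive observations, uses the plug-in (V-statistic) form of the energy distance 2E|A-B| - E|A-A'| - E|B-B'|, and picks average linkage. The helper names (lag_embed, energy_distance, energy_dissimilarity, ar1), the choice m = 2, and the simulated AR(1) data are all hypothetical choices made for the example.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import cdist, squareform


def lag_embed(x, m):
    """Rows are (x_t, ..., x_{t+m-1}); x is a univariate series (assumption)."""
    x = np.asarray(x, dtype=float)
    n = len(x) - m + 1
    return np.stack([x[i:i + n] for i in range(m)], axis=1)


def energy_distance(a, b):
    """Plug-in estimate of 2E|A-B| - E|A-A'| - E|B-B'| with Euclidean norm."""
    return 2.0 * cdist(a, b).mean() - cdist(a, a).mean() - cdist(b, b).mean()


def energy_dissimilarity(series, m=2):
    """Symmetric matrix of pairwise energy distances between lagged vectors."""
    emb = [lag_embed(s, m) for s in series]
    k = len(emb)
    d = np.zeros((k, k))
    for i in range(k):
        for j in range(i + 1, k):
            # Clip at zero to guard against tiny negative rounding errors.
            d[i, j] = d[j, i] = max(energy_distance(emb[i], emb[j]), 0.0)
    return d


def ar1(phi, n, rng):
    """Simulate x_t = phi * x_{t-1} + eps_t with standard normal noise."""
    x = np.zeros(n)
    eps = rng.standard_normal(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + eps[t]
    return x


# Hypothetical data: two groups with the same marginal distribution but
# opposite lag-1 dependence, so only the joint (m = 2) distributions differ.
rng = np.random.default_rng(0)
series = [ar1(0.7, 1000, rng) for _ in range(3)]
series += [ar1(-0.7, 1000, rng) for _ in range(3)]

D = energy_dissimilarity(series, m=2)
Z = linkage(squareform(D, checks=False), method="average")  # linkage choice is an assumption
labels = fcluster(Z, t=2, criterion="maxclust")
print(labels)
```

Passing Z to scipy.cluster.hierarchy.dendrogram would draw the dendrogram mentioned in the abstract. With m = 1 the two simulated groups would be nearly indistinguishable, since their marginal distributions coincide, which is why the sketch compares lagged vectors rather than single observations.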

Source journal
Journal of Time Series Analysis (Mathematics: Interdisciplinary Applications)
CiteScore: 2.00
Self-citation rate: 0.00%
Articles published: 39
Review time: 6-12 weeks
Journal description: During the last 30 years Time Series Analysis has become one of the most important and widely used branches of Mathematical Statistics. Its fields of application range from neurophysiology to astrophysics, and it covers such well-known areas as economic forecasting, the study of biological data, control systems, signal processing, and communications and vibrations engineering. The Journal of Time Series Analysis started in 1980 and has since become the leading journal in its field, publishing papers on both fundamental theory and applications, as well as review papers dealing with recent advances in major areas of the subject and short communications on theoretical developments. The editorial board consists of many of the world's leading experts in Time Series Analysis.