Fuzzy clustering of time series based on trend feature information granulation

IF 2.7, Zone 1 (Mathematics), JCR Q2 (COMPUTER SCIENCE, THEORY & METHODS)
Bin Yu, Chongyan Wu
Fuzzy Sets and Systems, vol. 519, Article 109522. Published 2025-07-08. DOI: 10.1016/j.fss.2025.109522
https://www.sciencedirect.com/science/article/pii/S0165011425002611
Citations: 0

Abstract

Clustering is a way to mine valuable information from large, complex time series data sets, and information granulation is a strategy that mimics human reasoning to solve complex problems. Combining the two offers a new perspective on knowledge discovery in time series. This paper extracts trend features from time series and represents them abstractly as information granules to reduce the data scale, then proposes a fuzzy C-means clustering algorithm for time series based on trend feature information granules. First, Hodrick-Prescott (HP) filtering is applied to the raw time series to remove noise and redundancy. Second, polynomial curve fitting (PCF) yields a global continuous fitting function for the series. Third, the polynomial function derivative (PFD) at each time point is taken as the trend feature, transforming the time series into a trend feature series. The feature series is then segmented optimally and, following the principle of reasonable granularity, represented by a set of information granules; this reduces dimensionality and yields the trend feature information granule (TFIG) sequence. Finally, fuzzy clustering is carried out in the transformed representation space (the information granule space). During clustering, a trend feature information granule-based dynamic time warping (TFIG-DTW) algorithm computes the distance between two granular time series of equal or unequal length, and weighted DTW barycenter averaging (wDBA) is incorporated into the fuzzy C-means (FCM) algorithm to update the cluster prototypes. Experiments on the UCR time series database and a stock data set verify the effectiveness and superiority of the proposed fuzzy clustering method.
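The abstract names the stages but gives no formulas, so the following is only a rough Python sketch of the generic building blocks, not the authors' implementation. It assumes a dense-matrix HP filter with smoothing parameter `lam`, a cubic global polynomial, plain point-wise DTW (standing in for TFIG-DTW, which operates on information granules), and the standard FCM membership update. The function names, `lam=1600`, `degree=3`, and the fuzzifier `m=2` are all illustrative assumptions. Optimal segmentation, granule construction, and wDBA prototype updates are deliberately left out.

```python
import numpy as np


def hp_filter(y, lam=1600.0):
    """Hodrick-Prescott trend: solve (I + lam * D'D) tau = y,
    where D is the second-difference operator."""
    y = np.asarray(y, dtype=float)
    n = y.size
    if n < 3:
        return y.copy()
    d = np.diff(np.eye(n), n=2, axis=0)  # (n-2, n) second-difference matrix
    return np.linalg.solve(np.eye(n) + lam * d.T @ d, y)


def trend_features(y, degree=3, lam=1600.0):
    """Smooth with HP, fit one global polynomial, and return its
    derivative at every time point (the trend feature series)."""
    trend = hp_filter(y, lam)
    t = np.arange(trend.size, dtype=float)
    # Polynomial.fit rescales t internally, which keeps the fit well conditioned.
    poly = np.polynomial.Polynomial.fit(t, trend, degree)
    return poly.deriv()(t)


def dtw_distance(a, b):
    """Classic DTW with absolute-difference local cost; works for
    sequences of equal or unequal length."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    n, m = a.size, b.size
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            acc[i, j] = cost + min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
    return acc[n, m]


def fcm_memberships(dist, m=2.0, eps=1e-12):
    """Standard FCM membership update from an (n_series, n_clusters)
    distance matrix; each row of the result sums to 1."""
    d = np.maximum(np.asarray(dist, dtype=float), eps)  # avoid division by zero
    inv = d ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = np.sin(np.linspace(0, 3, 60)) + 0.05 * rng.standard_normal(60)
    y = np.sin(np.linspace(0, 3, 45)) + 0.05 * rng.standard_normal(45)
    fx, fy = trend_features(x), trend_features(y)
    print("DTW distance between trend feature series:", dtw_distance(fx, fy))
    print("Memberships:", fcm_memberships([[0.5, 2.0], [3.0, 1.0]]))
```

In a full pipeline along the lines of the abstract, the DTW step would compare granule sequences rather than raw derivative values, and the memberships would weight a barycenter-averaging step that updates each cluster prototype.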
Source journal
Fuzzy Sets and Systems (Mathematics; Computer Science, Theory & Methods)
CiteScore: 6.50
Self-citation rate: 17.90%
Articles published: 321
Review time: 6.1 months
Journal description: Since its launch in 1978, the journal Fuzzy Sets and Systems has been devoted to the international advancement of the theory and application of fuzzy sets and systems. The theory of fuzzy sets now encompasses a well-organized corpus of basic notions including (but not restricted to) aggregation operations, a generalized theory of relations, specific measures of information content, and a calculus of fuzzy numbers. Fuzzy sets are also the cornerstone of a non-additive uncertainty theory, namely possibility theory, and of a versatile tool for both linguistic and numerical modeling: fuzzy rule-based systems. Numerous works now combine fuzzy concepts with other scientific disciplines as well as modern technologies. In mathematics, fuzzy sets have triggered new research topics in connection with category theory, topology, algebra, and analysis. Fuzzy sets are also part of a recent trend in the study of generalized measures and integrals, and are combined with statistical methods. Furthermore, fuzzy sets have strong logical underpinnings in the tradition of many-valued logics.