Fuzzy clustering of time series based on trend feature information granulation

IF 2.7; Region 1 (Mathematics); JCR Q2, COMPUTER SCIENCE, THEORY & METHODS

Fuzzy Sets and Systems, Pub Date: 2025-07-08, DOI: 10.1016/j.fss.2025.109522

Bin Yu, Chongyan Wu

Volume 519, Article 109522, https://www.sciencedirect.com/science/article/pii/S0165011425002611

Citations: 0

Abstract

Fuzzy clustering of time series based on trend feature information granulation

Clustering is a means of mining valuable information from complex, massive time series data sets, and information granulation is a strategy that simulates human thinking to solve complex problems. Combining the two provides a new perspective on knowledge discovery in time series. This paper extracts trend features from time series and represents them abstractly through information granulation to reduce the data scale, then proposes a fuzzy C-means clustering algorithm for time series based on trend feature information granules. First, Hodrick-Prescott (HP) filtering is applied to the raw time series data to remove noise and redundancy. Second, polynomial curve fitting (PCF) of the time series data yields a globally continuous fitting function. Third, the polynomial function derivative (PFD) of this function is evaluated at each time point and taken as the trend feature, which transforms the time series into a trend feature series. Fourth, the feature sequence is segmented optimally and, following the principle of reasonable granularity, represented by a group of information granules, which reduces its dimensionality and transforms it into a trend feature information granule (TFIG) sequence. Finally, fuzzy clustering of the time series is carried out in the transformed representation space (the information granule space). During clustering, a trend feature information granule-based dynamic time warping (TFIG-DTW) algorithm computes the distance between two granular time series of equal or unequal length, and weighted DTW barycenter averaging (wDBA) is incorporated into the fuzzy C-means (FCM) algorithm to update the cluster prototypes. Experiments on the UCR time series database and a stock data set verify the effectiveness and superiority of the proposed fuzzy clustering method.
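The first three steps of the abstract (HP filtering, polynomial fitting, derivative evaluation) and the distance step can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the smoothing parameter `lam=1600.0`, the polynomial `degree`, and the choice to fit the polynomial to the filtered series are all assumptions made for illustration. The optimal segmentation and granulation steps are omitted because the abstract does not specify them, and classic DTW on raw values stands in for TFIG-DTW.

```python
import numpy as np


def hp_filter(y, lam=1600.0):
    """Hodrick-Prescott trend: minimise ||y - t||^2 + lam * ||D2 t||^2."""
    y = np.asarray(y, dtype=float)
    n = y.size
    # Second-difference operator with shape (n - 2, n).
    d2 = np.diff(np.eye(n), n=2, axis=0)
    # Closed-form solution of the penalised least-squares problem.
    return np.linalg.solve(np.eye(n) + lam * d2.T @ d2, y)


def trend_features(y, lam=1600.0, degree=5):
    """Map a series to its trend feature series (derivative of a fitted polynomial)."""
    y = np.asarray(y, dtype=float)
    t = np.arange(y.size, dtype=float)
    smooth = hp_filter(y, lam)  # assumption: the fit uses the filtered series
    poly = np.polynomial.Polynomial.fit(t, smooth, degree)
    # deriv() accounts for the internal domain rescaling, so the result is
    # the slope with respect to the original time index.
    return poly.deriv()(t)


def dtw_distance(a, b):
    """Classic DTW with absolute-difference cost (stand-in for TFIG-DTW)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    cost = np.full((a.size + 1, b.size + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, a.size + 1):
        for j in range(1, b.size + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible warping moves.
            cost[i, j] = step + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[-1, -1]
```

Calling `dtw_distance(trend_features(x), trend_features(y))` compares two series by trend shape rather than by level, and it works when `x` and `y` differ in length, which is the property the abstract attributes to TFIG-DTW.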

Source journal

Fuzzy Sets and Systems (Mathematics; Computer Science: Theory & Methods)

CiteScore: 6.50

Self-citation rate: 17.90%

Articles published: 321

Review time: 6.1 months

About the journal: Since its launch in 1978, the journal Fuzzy Sets and Systems has been devoted to the international advancement of the theory and application of fuzzy sets and systems. The theory of fuzzy sets now encompasses a well-organized corpus of basic notions including (but not restricted to) aggregation operations, a generalized theory of relations, specific measures of information content, and a calculus of fuzzy numbers. Fuzzy sets are also the cornerstone of a non-additive uncertainty theory, namely possibility theory, and of a versatile tool for both linguistic and numerical modeling: fuzzy rule-based systems. Numerous works now combine fuzzy concepts with other scientific disciplines as well as modern technologies. In mathematics, fuzzy sets have triggered new research topics in connection with category theory, topology, algebra, and analysis. Fuzzy sets are also part of a recent trend in the study of generalized measures and integrals, and are combined with statistical methods. Furthermore, fuzzy sets have strong logical underpinnings in the tradition of many-valued logics.