A TIME SERIES KNOWLEDGE MINING FRAMEWORK EXPLOITING THE SYNERGY BETWEEN SUBSEQUENCE CLUSTERING AND PREDICTIVE MARKOVIAN MODELS

Q3 Economics, Econometrics and Finance

Fuzzy Economic Review | Pub Date: 2009-01-01 | DOI: 10.25102/FER.2009.01.03

V. Georgescu

Volume 14, Issue 1, Pages 41-66

Citations: 2

Abstract

This paper proposes a time series knowledge mining framework designed to exploit the synergy between subsequence time series clustering and predictive tools such as hidden Markov models. Many temporal data mining tasks depend heavily on the choice of representation scheme and dissimilarity measure. The first part presents a detailed taxonomy of representations for numeric and symbolic time series and a comprehensive categorization of distance measures. The second part addresses subsequence time series clustering with a sliding window and proposes a generalization of the Fuzzy C-Means algorithm based on the dynamic time warping distance as a very effective solution. This shape-based distance tolerates phase shifts in time and accelerations or decelerations along the time axis. It also makes it possible to determine the degree to which set-defined objects, such as subsequence time series and their cluster centroids (which are similar in nature), differ from each other. The third part discusses the integration of clustering algorithms with probabilistic predictive tools such as discrete Markov chains or hidden Markov models. We apply these techniques to the clustering of non-overlapping sequences extracted from historical data for the Standard and Poor’s 500 stock index, and we suggest different integrations with Markovian models to improve predictive power.
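As a rough illustration of the building blocks named in the abstract, the sketch below splits a series into non-overlapping windows, computes a textbook dynamic time warping distance, and estimates a discrete Markov chain transition matrix from a sequence of cluster labels. It is not the paper's generalized Fuzzy C-Means algorithm, which is not reproduced here. The function names, the absolute-difference local cost, and the uniform fallback for unvisited states are all assumptions made for this example.

```python
import numpy as np


def non_overlapping_windows(series, width):
    """Split a 1-D series into consecutive windows of `width` points.

    Trailing points that do not fill a whole window are dropped.
    """
    series = np.asarray(series, dtype=float)
    n_windows = len(series) // width
    return series[: n_windows * width].reshape(n_windows, width)


def dtw_distance(a, b):
    """Textbook dynamic time warping distance (assumed local cost: |x - y|)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    # cost[i, j] = cheapest alignment of a[:i] with b[:j]
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i, j] = local + min(
                cost[i - 1, j],      # a advances, b repeats its point
                cost[i, j - 1],      # b advances, a repeats its point
                cost[i - 1, j - 1],  # both advance
            )
    return float(cost[n, m])


def transition_matrix(labels, n_states):
    """Estimate first-order Markov transition probabilities from hard labels.

    Rows for states with no observed outgoing transition fall back to a
    uniform distribution (an assumption of this sketch).
    """
    counts = np.zeros((n_states, n_states))
    for current, following in zip(labels[:-1], labels[1:]):
        counts[current, following] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    safe_sums = np.where(row_sums > 0, row_sums, 1.0)
    return np.where(row_sums > 0, counts / safe_sums, 1.0 / n_states)
```

With these definitions, a shape delayed by one step, such as `[0, 1, 2, 1, 0]` against `[0, 0, 1, 2, 1, 0]`, has a dynamic time warping distance of zero, which is the tolerance to phase shifts that the abstract describes. In a fuzzy clustering setting, hard labels for `transition_matrix` could be obtained by assigning each window to its highest-membership cluster; that step is likewise an assumption, not something the abstract specifies.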




Source journal

Fuzzy Economic Review (Economics, Econometrics and Finance: Economics and Econometrics)

CiteScore

0.40

Self-citation rate

0.00%

Articles published