数据驱动的结构相似性:时间序列自适应线性逼近的距离度量

2015 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K) Pub Date : 2015-11-12 DOI:10.5220/0005597400670074

V. Ionescu, R. Potolea, M. Dînsoreanu

{"title":"数据驱动的结构相似性:时间序列自适应线性逼近的距离度量","authors":"V. Ionescu, R. Potolea, M. Dînsoreanu","doi":"10.5220/0005597400670074","DOIUrl":null,"url":null,"abstract":"Much effort has been invested in recent years in the problem of detecting similarity in time series. Most work focuses on the identification of exact matches through point-by-point comparisons, although in many real-world problems recurring patterns match each other only approximately. We introduce a new approach for identifying patterns in time series, which evaluates the similarity by comparing the overall structure of candidate sequences instead of focusing on the local shapes of the sequence and propose a new distance measure ABC (Area Between Curves) that is used to achieve this goal. The approach is based on a data-driven linear approximation method that is intuitive, offers a high compression ratio and adapts to the overall shape of the sequence. The similarity of candidate sequences is quantified by means of the novel distance measure, applied directly to the linear approximation of the time series. Our evaluations performed on multiple data sets show that our proposed technique outperforms similarity search approaches based on the commonly referenced Euclidean Distance in the majority of cases. The most significant improvements are obtained when applying our method to domains and data sets where matching sequences are indeed primarily determined based on the similarity of their higher-level structures.","PeriodicalId":102743,"journal":{"name":"2015 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K)","volume":"79 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2015-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Data driven structural similarity: A Distance measure for adaptive linear approximations of time series\",\"authors\":\"V. Ionescu, R. Potolea, M. Dînsoreanu\",\"doi\":\"10.5220/0005597400670074\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Much effort has been invested in recent years in the problem of detecting similarity in time series. Most work focuses on the identification of exact matches through point-by-point comparisons, although in many real-world problems recurring patterns match each other only approximately. We introduce a new approach for identifying patterns in time series, which evaluates the similarity by comparing the overall structure of candidate sequences instead of focusing on the local shapes of the sequence and propose a new distance measure ABC (Area Between Curves) that is used to achieve this goal. The approach is based on a data-driven linear approximation method that is intuitive, offers a high compression ratio and adapts to the overall shape of the sequence. The similarity of candidate sequences is quantified by means of the novel distance measure, applied directly to the linear approximation of the time series. Our evaluations performed on multiple data sets show that our proposed technique outperforms similarity search approaches based on the commonly referenced Euclidean Distance in the majority of cases. The most significant improvements are obtained when applying our method to domains and data sets where matching sequences are indeed primarily determined based on the similarity of their higher-level structures.\",\"PeriodicalId\":102743,\"journal\":{\"name\":\"2015 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K)\",\"volume\":\"79 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2015-11-12\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2015 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.5220/0005597400670074\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2015 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.5220/0005597400670074","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 1

摘要

近年来，人们在时间序列相似性检测问题上投入了大量的精力。大多数工作集中于通过逐点比较来确定精确匹配，尽管在许多现实世界的问题中，重复出现的模式彼此之间只是近似匹配。本文提出了一种新的识别时间序列模式的方法，即通过比较候选序列的整体结构来评估相似性，而不是关注序列的局部形状，并提出了一种新的距离度量ABC (Area Between Curves)来实现这一目标。该方法基于数据驱动的线性近似方法，该方法直观，具有高压缩比，并适应序列的整体形状。候选序列的相似性通过新的距离度量来量化，直接应用于时间序列的线性逼近。我们对多个数据集进行的评估表明，在大多数情况下，我们提出的技术优于基于常用的欧几里得距离的相似性搜索方法。当将我们的方法应用于匹配序列确实主要基于其高层结构的相似性确定的领域和数据集时，获得了最显著的改进。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Data driven structural similarity: A Distance measure for adaptive linear approximations of time series

Much effort has been invested in recent years in the problem of detecting similarity in time series. Most work focuses on the identification of exact matches through point-by-point comparisons, although in many real-world problems recurring patterns match each other only approximately. We introduce a new approach for identifying patterns in time series, which evaluates the similarity by comparing the overall structure of candidate sequences instead of focusing on the local shapes of the sequence and propose a new distance measure ABC (Area Between Curves) that is used to achieve this goal. The approach is based on a data-driven linear approximation method that is intuitive, offers a high compression ratio and adapts to the overall shape of the sequence. The similarity of candidate sequences is quantified by means of the novel distance measure, applied directly to the linear approximation of the time series. Our evaluations performed on multiple data sets show that our proposed technique outperforms similarity search approaches based on the commonly referenced Euclidean Distance in the majority of cases. The most significant improvements are obtained when applying our method to domains and data sets where matching sequences are indeed primarily determined based on the similarity of their higher-level structures.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2015 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K)

自引率

0.00%

发文量