Phongsakorn Sathianwiriyakhun, Thapanan Janyalikit, C. Ratanamahatana
2016 8th International Conference on Knowledge and Smart Technology (KST), published 2016-02-01. DOI: 10.1109/KST.2016.7440530 (https://doi.org/10.1109/KST.2016.7440530)
Citations: 18
Fast and accurate template averaging for time series classification
Time series data are ubiquitous across domains and applications. As a result, data mining tasks such as time series classification and clustering are routinely performed on them to discover useful knowledge. Dynamic Time Warping (DTW) is widely accepted as one of the best available similarity measures and has been used for distance calculation in both classification and clustering algorithms. Its known drawback, however, is its exceedingly high computational cost. Recently, data condensation through template averaging has been applied: each class is represented by a single template, which can greatly speed up DTW-based classification, especially on large datasets, at the cost of lower classification accuracy. Subsequent work has increased the number of representative templates to improve accuracy while keeping the computational complexity moderate. However, those algorithms still depend on many predefined, hard-to-set parameters, and some require long computation times to reach high accuracy. In this work, we therefore propose an accurate yet simple template averaging method that is parameter-free and requires much less computation time. Experimental results on 20 UCR time series benchmark datasets demonstrate that the proposed method achieves a speedup of a few orders of magnitude while maintaining high classification accuracy.
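To make the one-template-per-class idea concrete, the following is a minimal sketch and not the method proposed in the paper. It assumes equal-length training series and uses a naive pointwise mean as the class template, which is only a stand-in for a DTW-based averaging scheme. The function names (`dtw_distance`, `mean_template`, `build_templates`, `classify`) and the toy data are hypothetical.

```python
from math import inf


def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) DTW with squared pointwise cost."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            cost[i][j] = d + min(
                cost[i - 1][j],      # a[i-1] also aligned with b[j-1]'s predecessor path
                cost[i][j - 1],      # b[j-1] repeated against a[i-1]
                cost[i - 1][j - 1],  # both series advance together
            )
    return cost[n][m] ** 0.5


def mean_template(series_list):
    """Pointwise mean of equal-length series (assumption: a naive stand-in
    for DTW-based averaging, not the paper's averaging method)."""
    length = len(series_list[0])
    count = len(series_list)
    return [sum(s[t] for s in series_list) / count for t in range(length)]


def build_templates(train):
    """train: dict mapping class label -> list of equal-length series."""
    return {label: mean_template(items) for label, items in train.items()}


def classify(query, templates):
    """Return the label whose template is nearest to the query under DTW."""
    return min(templates, key=lambda label: dtw_distance(query, templates[label]))


# Hypothetical toy data: two classes with two training series each.
train = {
    "rise": [[0, 1, 2, 3], [0, 1, 3, 3]],
    "fall": [[3, 2, 1, 0], [3, 3, 1, 0]],
}
templates = build_templates(train)
label = classify([0, 0, 1, 2, 3], templates)
```

With one template per class, classifying a query requires one DTW computation per class instead of one per training series, which is where the speedup described above comes from. The accuracy trade-off arises because a single averaged template may not capture all the variation within a class.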