Efficiency of Time Series Clustering Method Based on Distribution of Difference Using Several Distances
Phudit Thanakulkairid, Tanupat Trakulthongchai, Naruesorn Prabpon, Pat Vatiwutipong
2022 19th International Joint Conference on Computer Science and Software Engineering (JCSSE), 2022-06-22
DOI: https://doi.org/10.1109/jcsse54890.2022.9836279
Abstract
Clustering is a machine learning method widely used in time series analysis. In this work, we cluster time series using four distance functions: Euclidean distance, Kullback-Leibler divergence, Wasserstein distance, and dynamic time warping. For each time series, we take the distribution of its first-order difference and compare time series through these distributions under each of the four distances. We then model each time series as a vertex of a graph and the distance between each pair of time series as the weight of the edge joining them. Clustering is performed by graph partitioning. The advantages and drawbacks of each method are discussed. The experimental results show that Euclidean distance and Kullback-Leibler divergence yield better and more efficient clustering than the other two.
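The pipeline described in the abstract might be sketched as follows. This is a minimal illustration, not the authors' implementation: the histogram binning, the smoothing constant `eps`, the choice to measure Wasserstein distance in units of bin widths, and all function names are assumptions made here. Dynamic time warping and the graph partitioning step are omitted because the abstract does not specify them.

```python
import math

def first_difference(series):
    """First-order difference x[t+1] - x[t]."""
    return [b - a for a, b in zip(series, series[1:])]

def histogram(values, lo, hi, bins):
    """Normalized equal-width histogram on [lo, hi] (binning is an assumption)."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = int((v - lo) / width)
        counts[max(0, min(idx, bins - 1))] += 1  # clamp edge values into range
    return [c / len(values) for c in counts]

def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def kl_divergence(p, q, eps=1e-9):
    """KL(p || q); eps smoothing (an assumption) avoids log(0). Not symmetric."""
    ps = [a + eps for a in p]
    qs = [b + eps for b in q]
    sp, sq = sum(ps), sum(qs)
    return sum((a / sp) * math.log((a / sp) / (b / sq)) for a, b in zip(ps, qs))

def wasserstein_bins(p, q):
    """1-Wasserstein distance between histograms on shared bins, in bin widths:
    the sum of absolute differences between the two cumulative distributions."""
    cp = cq = total = 0.0
    for a, b in zip(p, q):
        cp += a
        cq += b
        total += abs(cp - cq)
    return total

def distance_graph(series_list, dist, bins=10):
    """Weighted edges {(i, j): distance} over all pairs i < j."""
    diffs = [first_difference(s) for s in series_list]
    lo = min(min(d) for d in diffs)
    hi = max(max(d) for d in diffs)
    if hi == lo:  # degenerate case: all differences identical
        hi = lo + 1.0
    hists = [histogram(d, lo, hi, bins) for d in diffs]
    n = len(hists)
    return {(i, j): dist(hists[i], hists[j])
            for i in range(n) for j in range(i + 1, n)}

# Hypothetical toy data: a and c share the same increments, b does not.
a = [0, 1, 2, 3, 4]
b = [0, 2, 4, 6, 8]
c = [5, 6, 7, 8, 9]
edges = distance_graph([a, b, c], euclidean)
```

Because the comparison uses the distribution of increments rather than the raw values, series `a` and `c` in this toy data are at distance zero even though their levels differ, while `b` is far from both. Passing `kl_divergence` or `wasserstein_bins` as `dist` swaps the distance without changing the rest of the sketch.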