The Performance of Distances Between Time Series: An In-Depth Comparison
Margarida G. M. S. Cardoso, Ana A. Martins
Expert Systems, vol. 42, no. 8. Published 2025-06-24. DOI: 10.1111/exsy.70093
https://onlinelibrary.wiley.com/doi/10.1111/exsy.70093
Impact factor: 2.3. JCR: Q2 (Computer Science, Artificial Intelligence). Open access: no.
Citations: 0
Abstract
The performance of distance measures between time series has been discussed in diverse studies. Most equate performance with the accuracy obtained when a specific distance is used in a 1-Nearest Neighbour (1-NN) classifier. Few studies have addressed the related computation time, and no systematic analysis of the associations between the distances' performance (1-NN-based accuracy and computation time) and the time series' characteristics has been presented yet. We propose to fill this research gap by analysing these relationships, considering the following features: the training and test sets' dimensions, the time series' length, the number of classes, and the classes' separability as measured by the Average Silhouette index. This last characteristic was not mentioned in previous studies. A methodological approach is devised to compare nine distance measures, including three recently proposed combined distances (COMB and two variants). We use a stepwise method for multiple comparisons and account for the experiment-wise error rate to obtain homogeneous groups of distances with indistinguishable performance. The CART algorithm is used to explore the relationships between the accuracy values obtained with each distance measure under study (target) and the time series characteristics (predictors). Our analyses are based on datasets from the UCR time series classification archive. We conclude that the combined distance (COMB), the dynamic time warping distance (DTW), and the complexity invariance distance (CID) are consistently included in the subset of best-performing distances in all experimental scenarios. Of these three, CID has a significantly lower computational cost. We also find that the classes' separability is the time series attribute most associated with the distances' performance.
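To make the evaluation setting concrete, the following is a minimal Python sketch of 1-NN accuracy under three distances (Euclidean, DTW, and CID) together with an average silhouette computed from class labels. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say which DTW variant (window, local cost) was used, how CID handles constant series, or which distance underlies the silhouette, so those choices, the function names, and the toy data are all hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dtw(a, b):
    """Unconstrained DTW with squared local cost (assumed variant)."""
    inf = float("inf")
    prev = [0.0] + [inf] * len(b)          # row 0 of the cost matrix
    for i in range(1, len(a) + 1):
        curr = [inf] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return math.sqrt(prev[len(b)])

def complexity(a):
    """Complexity estimate: length of the series 'stretched out'."""
    return math.sqrt(sum((a[i] - a[i + 1]) ** 2 for i in range(len(a) - 1)))

def cid(a, b):
    """Euclidean distance scaled by the ratio of the two complexities."""
    lo, hi = sorted((complexity(a), complexity(b)))
    # Assumption: use a factor of 1 when a series is constant (lo == 0).
    factor = hi / lo if lo > 0 else 1.0
    return euclidean(a, b) * factor

def one_nn_accuracy(train_X, train_y, test_X, test_y, dist):
    """Fraction of test series whose nearest training series shares its label."""
    correct = 0
    for x, y in zip(test_X, test_y):
        nearest = min(range(len(train_X)), key=lambda k: dist(x, train_X[k]))
        correct += train_y[nearest] == y
    return correct / len(test_X)

def average_silhouette(X, y, dist):
    """Mean silhouette using class labels as clusters (needs >= 2 classes)."""
    total = 0.0
    for i in range(len(X)):
        sums, counts = {}, {}
        for j in range(len(X)):
            if j != i:
                sums[y[j]] = sums.get(y[j], 0.0) + dist(X[i], X[j])
                counts[y[j]] = counts.get(y[j], 0) + 1
        if y[i] not in counts:             # singleton class contributes 0
            continue
        a = sums[y[i]] / counts[y[i]]      # mean distance within own class
        b = min(sums[c] / counts[c] for c in counts if c != y[i])
        total += (b - a) / max(a, b) if max(a, b) > 0 else 0.0
    return total / len(X)

# Hypothetical toy data: a bump that is shifted by one step in the test series.
train_X = [[0, 0, 0, 0, 0, 0], [0, 0, 1, 0, 0, 0]]
train_y = ["flat", "bump"]
test_X, test_y = [[0, 0, 0, 1, 0, 0]], ["bump"]

for name, dist in [("ED", euclidean), ("DTW", dtw), ("CID", cid)]:
    print(name, one_nn_accuracy(train_X, train_y, test_X, test_y, dist))
```

On this toy example the shifted bump is closer to the flat series under the Euclidean distance but matches the training bump exactly under DTW, which illustrates why the choice of distance can change 1-NN accuracy. The pure-Python DTW is quadratic in the series length, so it also hints at why computation time is treated as a separate performance criterion.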
About the journal:
Expert Systems: The Journal of Knowledge Engineering publishes papers dealing with all aspects of knowledge engineering, including individual methods and techniques in knowledge acquisition and representation, and their application in the construction of systems – including expert systems – based thereon. Detailed scientific evaluation is an essential part of any paper.
As well as traditional application areas, such as Software and Requirements Engineering, Human-Computer Interaction, and Artificial Intelligence, we are aiming at the new and growing markets for these technologies, such as Business, Economy, Market Research, and Medical and Health Care. The shift towards this new focus will be marked by a series of special issues covering hot and emergent topics.