An Amortized O(1) Lower Bound for Dynamic Time Warping in Motif Discovery
Zemin Chao; Hong Gao; Dongjing Miao; Jianzhong Li; Hongzhi Wang
IEEE Transactions on Knowledge and Data Engineering, vol. 37, no. 5, pp. 2239-2252. Published 2025-02-24. Journal article.
DOI: 10.1109/TKDE.2025.3544751
https://ieeexplore.ieee.org/document/10900728/
Journal impact factor: 8.9. JCR: Q1 (Computer Science, Artificial Intelligence).
Citations: 0
Abstract
Motif discovery is a critical operation for analyzing series data in many applications. Recent work demonstrates the importance of finding motifs under Dynamic Time Warping (DTW). However, existing algorithms spend most of their time computing lower bounds of DTW to filter out unpromising candidates. Specifically, computing these lower bounds takes $O(L)$ time for each pair of subsequences, where $L$ is the motif length (the length of each subsequence). This paper proposes two new lower bounds, $LB_{f}$ and $LB_{M}$, each of which costs only amortized $O(1)$ time per pair of subsequences. On real datasets, the proposed lower bounds are at least an order of magnitude faster than the state-of-the-art lower bounds used in motif discovery while remaining satisfactorily effective. Based on these faster lower bounds, the paper designs an efficient motif discovery algorithm that significantly reduces the cost of computing lower bounds. Experiments on real datasets show that the proposed algorithm is on average 5.6 times faster than state-of-the-art algorithms.
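To make the role of lower bounds concrete, the following is a minimal sketch of lower-bound pruning in a DTW closest-pair (motif) search. It does not implement the paper's $LB_{f}$ or $LB_{M}$, which the abstract does not define; it substitutes the classical LB_Keogh envelope bound as an example of a bound whose per-pair cost grows with $L$. The function names, the band width `w`, the squared point cost, and the omission of z-normalization are all assumptions made for illustration.

```python
import math

def dtw(a, b, w):
    """Banded DTW between equal-length sequences, squared point cost."""
    n = len(a)
    prev = [math.inf] * (n + 1)
    prev[0] = 0.0
    for i in range(1, n + 1):
        cur = [math.inf] * (n + 1)
        # Only cells within the warping band |i - j| <= w are reachable.
        for j in range(max(1, i - w), min(n, i + w) + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            cur[j] = cost + min(prev[j], prev[j - 1], cur[j - 1])
        prev = cur
    return prev[n]

def lb_keogh(q, c, w):
    """Classical LB_Keogh bound (stand-in, not the paper's LB_f / LB_M).

    The envelope is recomputed naively here; with precomputed envelopes
    the per-pair cost is still O(L), the cost the abstract refers to.
    """
    n = len(c)
    total = 0.0
    for i, x in enumerate(q):
        window = c[max(0, i - w):min(n, i + w + 1)]
        hi, lo = max(window), min(window)
        if x > hi:
            total += (x - hi) ** 2
        elif x < lo:
            total += (lo - x) ** 2
    return total

def closest_pair(series, L, w):
    """Best non-overlapping pair of length-L subsequences under banded DTW."""
    best = (math.inf, None, None)
    pruned = 0
    n = len(series) - L + 1
    for i in range(n):
        for j in range(i + L, n):
            a, b = series[i:i + L], series[j:j + L]
            # If the lower bound cannot beat the best-so-far, skip full DTW.
            if lb_keogh(a, b, w) >= best[0]:
                pruned += 1
                continue
            d = dtw(a, b, w)
            if d < best[0]:
                best = (d, i, j)
    return best, pruned

series = [0, 1, 3, 1, 0, 5, 9, 2, 7, 4, 0, 1, 3, 1, 0]
best, pruned = closest_pair(series, L=5, w=1)
print(best, pruned)
```

In this sketch every candidate pair still pays for a lower-bound evaluation that scans the whole subsequence, which is the overhead the abstract says dominates existing algorithms and that an amortized $O(1)$ bound is meant to remove.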
About the journal:
The IEEE Transactions on Knowledge and Data Engineering encompasses knowledge and data engineering aspects within computer science, artificial intelligence, electrical engineering, computer engineering, and related fields. It provides an interdisciplinary platform for disseminating new developments in knowledge and data engineering and explores the practicality of these concepts in both hardware and software. Specific areas covered include knowledge-based and expert systems, AI techniques for knowledge and data management, tools and methodologies, distributed processing, real-time systems, architectures, data management practices, database design, query languages, security, fault tolerance, statistical databases, algorithms, performance evaluation, and applications.