Series2vec: similarity-based self-supervised representation learning for time series classification

IF 4.3 3区计算机科学 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Data Mining and Knowledge Discovery Pub Date : 2024-06-20 DOI:10.1007/s10618-024-01043-w

Navid Mohammadi Foumani, Chang Wei Tan, Geoffrey I. Webb, Hamid Rezatofighi, Mahsa Salehi

{"title":"Series2vec: similarity-based self-supervised representation learning for time series classification","authors":"Navid Mohammadi Foumani, Chang Wei Tan, Geoffrey I. Webb, Hamid Rezatofighi, Mahsa Salehi","doi":"10.1007/s10618-024-01043-w","DOIUrl":null,"url":null,"abstract":"<p>We argue that time series analysis is fundamentally different in nature to either vision or natural language processing with respect to the forms of meaningful self-supervised learning tasks that can be defined. Motivated by this insight, we introduce a novel approach called <i>Series2Vec</i> for self-supervised representation learning. Unlike the state-of-the-art methods in time series which rely on hand-crafted data augmentation, Series2Vec is trained by predicting the similarity between two series in both temporal and spectral domains through a self-supervised task. By leveraging the similarity prediction task, which has inherent meaning for a wide range of time series analysis tasks, Series2Vec eliminates the need for hand-crafted data augmentation. To further enforce the network to learn similar representations for similar time series, we propose a novel approach that applies order-invariant attention to each representation within the batch during training. Our evaluation of Series2Vec on nine large real-world datasets, along with the UCR/UEA archive, shows enhanced performance compared to current state-of-the-art self-supervised techniques for time series. Additionally, our extensive experiments show that Series2Vec performs comparably with fully supervised training and offers high efficiency in datasets with limited-labeled data. Finally, we show that the fusion of Series2Vec with other representation learning models leads to enhanced performance for time series classification. Code and models are open-source at https://github.com/Navidfoumani/Series2Vec</p>","PeriodicalId":55183,"journal":{"name":"Data Mining and Knowledge Discovery","volume":"139 1","pages":""},"PeriodicalIF":4.3000,"publicationDate":"2024-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Data Mining and Knowledge Discovery","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s10618-024-01043-w","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 0

Abstract

We argue that time series analysis is fundamentally different in nature to either vision or natural language processing with respect to the forms of meaningful self-supervised learning tasks that can be defined. Motivated by this insight, we introduce a novel approach called Series2Vec for self-supervised representation learning. Unlike the state-of-the-art methods in time series which rely on hand-crafted data augmentation, Series2Vec is trained by predicting the similarity between two series in both temporal and spectral domains through a self-supervised task. By leveraging the similarity prediction task, which has inherent meaning for a wide range of time series analysis tasks, Series2Vec eliminates the need for hand-crafted data augmentation. To further enforce the network to learn similar representations for similar time series, we propose a novel approach that applies order-invariant attention to each representation within the batch during training. Our evaluation of Series2Vec on nine large real-world datasets, along with the UCR/UEA archive, shows enhanced performance compared to current state-of-the-art self-supervised techniques for time series. Additionally, our extensive experiments show that Series2Vec performs comparably with fully supervised training and offers high efficiency in datasets with limited-labeled data. Finally, we show that the fusion of Series2Vec with other representation learning models leads to enhanced performance for time series classification. Code and models are open-source at https://github.com/Navidfoumani/Series2Vec

Abstract Image

查看原文本刊更多论文

Series2vec：基于相似性的时间序列分类自监督表示学习

我们认为，就可定义的有意义的自我监督学习任务形式而言，时间序列分析与视觉或自然语言处理在本质上有着根本的不同。受此启发，我们为自我监督表示学习引入了一种名为 Series2Vec 的新方法。与时间序列中依赖手工创建数据增强的最先进方法不同，Series2Vec 是通过自监督任务预测两个序列在时间和频谱领域的相似性来进行训练的。通过利用对各种时间序列分析任务具有内在意义的相似性预测任务，Series2Vec 消除了手工制作数据扩增的需要。为了进一步强制网络学习相似时间序列的相似表示，我们提出了一种新方法，即在训练期间对批次中的每个表示应用顺序不变的关注。我们在九个大型真实数据集和 UCR/UEA 档案中对 Series2Vec 进行了评估，结果表明，与当前最先进的时间序列自监督技术相比，Series2Vec 的性能有所提高。此外，我们的大量实验表明，Series2Vec 的性能可与完全监督式训练相媲美，并能在标签数据有限的数据集中提供高效率。最后，我们还展示了 Series2Vec 与其他表示学习模型的融合，从而提高了时间序列分类的性能。代码和模型开源于 https://github.com/Navidfoumani/Series2Vec

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

Data Mining and Knowledge Discovery 工程技术-计算机：人工智能

CiteScore

10.40

自引率

4.20%

发文量

审稿时长

10 months

期刊介绍： Advances in data gathering, storage, and distribution have created a need for computational tools and techniques to aid in data analysis. Data Mining and Knowledge Discovery in Databases (KDD) is a rapidly growing area of research and application that builds on techniques and theories from many fields, including statistics, databases, pattern recognition and learning, data visualization, uncertainty modelling, data warehousing and OLAP, optimization, and high performance computing.