Chixuan Wei;Jidong Yuan;Yi Zhang;Zhongyang Yu;Yanze Liu;Haiyang Liu
Ranking Neighborhood and Class Prototype Contrastive Learning for Time Series
IEEE Transactions on Big Data, vol. 11, no. 4, pp. 1907-1917. Published 2024-11-08.
DOI: 10.1109/TBDATA.2024.3495509
URL: https://ieeexplore.ieee.org/document/10748408/
Impact factor: 5.7 (JCR Q1, Computer Science, Information Systems)
Citations: 0
Abstract
Time series are often complex and information-rich but sparsely labeled, which makes them challenging to model. Existing contrastive learning methods generate augmented views of a series and maximize the similarity between them. However, they ignore the similarity of adjacent timestamps and suffer from sampling bias. In this paper, we propose a self-supervised framework for learning generalizable representations of time series, called $\mathbf{R}$anking n$\mathbf{E}$ighborhood and cla$\mathbf{S}$s prototyp$\mathbf{E}$ contr$\mathbf{A}$stive $\mathbf{L}$earning (RESEAL). It exploits similarity ranking information to learn an embedding space in which positive samples are ranked according to their temporal order. Additionally, RESEAL introduces a class prototype contrastive learning module, which contrasts time series representations and their corresponding centroids as positives against truly negative pairs from different clusters, mitigating the sampling bias issue. Extensive experiments on several multivariate and univariate time series tasks (i.e., classification, anomaly detection, and forecasting) demonstrate that our representation framework achieves significant improvement over existing self-supervised time series representation baselines.
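The abstract does not give the loss functions, so the following is only an illustrative sketch of the two ideas it names, not the RESEAL implementation. It assumes cosine similarity, a pairwise hinge loss for the temporal ranking, and an InfoNCE-style loss for the prototype contrast. All function names, the `margin` and `temperature` parameters, and the toy data are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def ranking_neighborhood_loss(embeddings, t, max_offset=3, margin=0.1):
    """Assumed hinge ranking loss: a nearer neighbor of timestamp t should be
    more similar to t than any farther neighbor, by at least `margin`."""
    sims = [cosine(embeddings[t], embeddings[t + k]) for k in range(1, max_offset + 1)]
    loss = 0.0
    for i in range(len(sims)):
        for j in range(i + 1, len(sims)):
            # sims[i] belongs to the nearer neighbor and should exceed sims[j].
            loss += max(0.0, margin - (sims[i] - sims[j]))
    return loss

def cluster_centroids(z, assignments, n_clusters):
    """Mean vector of each cluster (assumes every cluster is non-empty)."""
    dim = len(z[0])
    sums = [[0.0] * dim for _ in range(n_clusters)]
    counts = [0] * n_clusters
    for vec, c in zip(z, assignments):
        counts[c] += 1
        for d in range(dim):
            sums[c][d] += vec[d]
    return [[s / counts[c] for s in sums[c]] for c in range(n_clusters)]

def prototype_contrastive_loss(z, assignments, centroids, temperature=0.5):
    """Assumed InfoNCE-style loss: each representation is pulled toward its own
    centroid (positive) and pushed away from samples in other clusters (negatives)."""
    total = 0.0
    for vec, c in zip(z, assignments):
        pos = math.exp(cosine(vec, centroids[c]) / temperature)
        neg = sum(
            math.exp(cosine(vec, other) / temperature)
            for other, oc in zip(z, assignments)
            if oc != c
        )
        total += -math.log(pos / (pos + neg))
    return total / len(z)

# Toy data: timestamp embeddings that rotate smoothly, so nearer means more similar.
timestamps = [[math.cos(0.3 * k), math.sin(0.3 * k)] for k in range(4)]
print("ranking loss:", ranking_neighborhood_loss(timestamps, t=0))

# Toy data: four series-level representations in two hand-assigned clusters.
z = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
assignments = [0, 0, 1, 1]
centroids = cluster_centroids(z, assignments, n_clusters=2)
print("prototype loss:", prototype_contrastive_loss(z, assignments, centroids))
```

In this sketch, drawing negatives only from other clusters is what stands in for the "truly negative pairs" of the abstract. The cluster assignments are supplied by hand here; in practice they would come from a clustering step that the abstract does not specify.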
About the journal:
The IEEE Transactions on Big Data publishes peer-reviewed articles focusing on big data. These articles present innovative research ideas and application results across disciplines, including novel theories, algorithms, and applications. Research areas cover a wide range, such as big data analytics, visualization, curation, management, semantics, infrastructure, standards, performance analysis, intelligence extraction, scientific discovery, security, privacy, and legal issues specific to big data. The journal also prioritizes applications of big data in fields generating massive datasets.