A Data-Level Augmentation Framework for Time Series Forecasting With Ambiguously Related Source Data
Authors: Rui Ye; Qun Dai
Journal: IEEE Transactions on Knowledge and Data Engineering, vol. 37, no. 7, pp. 3855-3868 (Impact Factor 10.4; JCR Q1, Computer Science, Artificial Intelligence)
Published: 2025-04-04
DOI: 10.1109/TKDE.2025.3555530
URL: https://ieeexplore.ieee.org/document/10949281/
Citations: 0
Abstract
Many practical time series forecasting (TSF) tasks are plagued by data limitations. To alleviate this challenge, we design a data-level augmentation framework. It consists of a time series generation (TSG) module and a source data selection (Sel-src) module. The TSG module aims to achieve better generation results by considering both the global profile and the temporal dynamics of a series. However, when only a little target data is available, the TSG module may simply imitate the limited target samples, leading to poor generalization performance. A natural remedy is to seek help from a related source domain, which can provide additional useful information for the TSG module. Here we consider a more complex situation, in which the relevance between the source and target domains is ambiguous; that is, irrelevant samples may exist in the source domain, and blindly using all the source data may be counterproductive. To meet this challenge, the Sel-src module is designed to select effective source samples through Inter-Representation Learning (Inter-RL) and Intra-Representation Learning (Intra-RL). The effectiveness of the algorithm is supported in two respects: the quality of the augmented data and the improvement in forecasting accuracy after augmentation.
About the journal:
The IEEE Transactions on Knowledge and Data Engineering encompasses knowledge and data engineering aspects within computer science, artificial intelligence, electrical engineering, computer engineering, and related fields. It provides an interdisciplinary platform for disseminating new developments in knowledge and data engineering and explores the practicality of these concepts in both hardware and software. Specific areas covered include knowledge-based and expert systems, AI techniques for knowledge and data management, tools and methodologies, distributed processing, real-time systems, architectures, data management practices, database design, query languages, security, fault tolerance, statistical databases, algorithms, performance evaluation, and applications.