Resformer: Combine quadratic linear transformation with efficient sparse Transformer for long-term series forecasting
Gongguan Chen, Hua Wang, Yepeng Liu, Mingli Zhang, Fan Zhang
Intelligent Data Analysis (JCR Q4, Computer Science, Artificial Intelligence), published 2023-10-19. DOI: 10.3233/ida-227006
Citations: 0
Abstract
With the continuous development of deep learning, long sequence time-series forecasting (LSTF) has attracted increasing attention in power consumption prediction, traffic prediction, and stock prediction. Recent studies favor various improved variants of the Transformer. While these models have made breakthroughs in reducing the time and space complexity of the Transformer, problems remain: the predictive power of the improved models is slightly lower than that of the standard Transformer, and these models ignore the importance of special values in the time series. To address these problems, we designed a more concise network named Resformer, which has four significant characteristics: (1) A fully sparse self-attention mechanism achieves O(L log L) time complexity. (2) An AMS module processes the special values of the time series and offers comparable performance on sequence dependency alignment. (3) Using a quadratic linear transformation, a simple LT module is designed to replace the self-attention mechanism, effectively reducing redundant information. (4) A DistPooling method based on the data distribution is proposed to suppress redundant information and noise. Extensive experiments on real data sets show that Resformer is superior to existing improved models and the standard Transformer.
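The abstract does not spell out how the "fully sparse" self-attention reaches O(L log L). A common way such complexity is obtained in LSTF Transformers is to let only a small set of dominant queries (on the order of log L) attend with a full softmax, while the remaining queries fall back to a cheap default such as the mean of the values. The sketch below illustrates that general idea only; the function name `sparse_attention`, the sampling-based scoring proxy, and the mean fallback are assumptions for illustration, not Resformer's actual mechanism.

```python
import numpy as np

def sparse_attention(Q, K, V, top_u=None, seed=0):
    """Illustrative sketch of a sparse attention with ~O(L log L) cost.

    Only the top_u (~log L) highest-scoring queries compute a full
    softmax over all keys; the rest output the mean of V. This mirrors
    the general log-sparse idea, NOT the paper's exact design.
    """
    L, d = Q.shape
    if top_u is None:
        top_u = max(1, int(np.ceil(np.log2(L))))  # u ~ log L active queries

    # Cheap proxy for query "activeness": max-minus-mean of logits
    # against a random key subset of size ~log L (O(L log L) total work).
    rng = np.random.default_rng(seed)
    sample = rng.choice(L, size=top_u, replace=False)
    logits_sample = Q @ K[sample].T / np.sqrt(d)
    proxy = logits_sample.max(axis=1) - logits_sample.mean(axis=1)
    active = np.argsort(proxy)[-top_u:]  # indices of dominant queries

    # Lazy queries: fall back to the mean of V (uniform-attention default).
    out = np.tile(V.mean(axis=0), (L, 1))

    # Active queries: ordinary scaled dot-product attention over all keys.
    logits = Q[active] @ K.T / np.sqrt(d)
    weights = np.exp(logits - logits.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    out[active] = weights @ V
    return out
```

Because only ~log L queries ever touch the full L-length key set, the dominant cost is the O(L log L) proxy scoring rather than the O(L^2) of dense attention.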
About the journal
Intelligent Data Analysis provides a forum for the examination of issues related to the research and applications of Artificial Intelligence techniques in data analysis across a variety of disciplines. These techniques include (but are not limited to): all areas of data visualization, data pre-processing (fusion, editing, transformation, filtering, sampling), data engineering, database mining techniques, tools and applications, use of domain knowledge in data analysis, big data applications, evolutionary algorithms, machine learning, neural nets, fuzzy logic, statistical pattern recognition, knowledge filtering, and post-processing. In particular, papers are preferred that discuss development of new AI related data analysis architectures, methodologies, and techniques and their applications to various domains.