AGCNT: Adaptive Graph Convolutional Network for Transformer-based Long Sequence Time-Series Forecasting

Hongyang Su, Xiaolong Wang, Yang Qin
{"title":"AGCNT:基于变压器长序列时间序列预测的自适应图卷积网络","authors":"Hongyang Su, Xiaolong Wang, Yang Qin","doi":"10.1145/3459637.3482054","DOIUrl":null,"url":null,"abstract":"Long sequence time-series forecasting(LSTF) plays an important role in a variety of real-world application scenarios, such as electricity forecasting, weather forecasting, and traffic flow forecasting. It has previously been observed that transformer-based models have achieved outstanding results on LSTF tasks, which can reduce the complexity of the model and maintain stable prediction accuracy. Nevertheless, there are still some issues that limit the performance of transformer-based models for LSTF tasks: (i) the potential correlation between sequences is not considered; (ii) the inherent structure of encoder-decoder is difficult to expand after being optimized from the aspect of complexity. In order to solve these two problems, we propose a transformer-based model, named AGCNT, which is efficient and can capture the correlation between the sequences in the multivariate LSTF task without causing the memory bottleneck. Specifically, AGCNT has several characteristics: (i) a probsparse adaptive graph self-attention, which maps long sequences into a low-dimensional dense graph structure with an adaptive graph generation and captures the relationships between sequences with an adaptive graph convolution; (ii) the stacked encoder with distilling probsparse graph self-attention integrates the graph attention mechanism and retains the dominant attention of the cascade layer, which preserves the correlation between sparse queries from long sequences; (iii) the stacked decoder with generative inference generates all prediction values in one forward operation, which can improve the inference speed of long-term predictions. Experimental results on 4 large-scale datasets demonstrate the AGCNT outperforms state-of-the-art baselines.","PeriodicalId":405296,"journal":{"name":"Proceedings of the 30th ACM International Conference on Information & Knowledge Management","volume":"2169 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"AGCNT: Adaptive Graph Convolutional Network for Transformer-based Long Sequence Time-Series Forecasting\",\"authors\":\"Hongyang Su, Xiaolong Wang, Yang Qin\",\"doi\":\"10.1145/3459637.3482054\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Long sequence time-series forecasting(LSTF) plays an important role in a variety of real-world application scenarios, such as electricity forecasting, weather forecasting, and traffic flow forecasting. It has previously been observed that transformer-based models have achieved outstanding results on LSTF tasks, which can reduce the complexity of the model and maintain stable prediction accuracy. Nevertheless, there are still some issues that limit the performance of transformer-based models for LSTF tasks: (i) the potential correlation between sequences is not considered; (ii) the inherent structure of encoder-decoder is difficult to expand after being optimized from the aspect of complexity. In order to solve these two problems, we propose a transformer-based model, named AGCNT, which is efficient and can capture the correlation between the sequences in the multivariate LSTF task without causing the memory bottleneck. 
Specifically, AGCNT has several characteristics: (i) a probsparse adaptive graph self-attention, which maps long sequences into a low-dimensional dense graph structure with an adaptive graph generation and captures the relationships between sequences with an adaptive graph convolution; (ii) the stacked encoder with distilling probsparse graph self-attention integrates the graph attention mechanism and retains the dominant attention of the cascade layer, which preserves the correlation between sparse queries from long sequences; (iii) the stacked decoder with generative inference generates all prediction values in one forward operation, which can improve the inference speed of long-term predictions. Experimental results on 4 large-scale datasets demonstrate the AGCNT outperforms state-of-the-art baselines.\",\"PeriodicalId\":405296,\"journal\":{\"name\":\"Proceedings of the 30th ACM International Conference on Information & Knowledge Management\",\"volume\":\"2169 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-10-26\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 30th ACM International Conference on Information & Knowledge Management\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3459637.3482054\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 30th ACM International Conference on Information & Knowledge Management","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3459637.3482054","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Long sequence time-series forecasting (LSTF) plays an important role in a variety of real-world applications, such as electricity forecasting, weather forecasting, and traffic flow forecasting. Transformer-based models have previously achieved outstanding results on LSTF tasks: they reduce model complexity while maintaining stable prediction accuracy. Nevertheless, two issues still limit the performance of transformer-based models on LSTF tasks: (i) the potential correlation between sequences is not considered; and (ii) the inherent encoder-decoder structure is difficult to extend once it has been optimized for complexity. To address these two problems, we propose a transformer-based model, named AGCNT, which is efficient and captures the correlation between sequences in the multivariate LSTF task without creating a memory bottleneck. Specifically, AGCNT has the following characteristics: (i) a probsparse adaptive graph self-attention, which maps long sequences into a low-dimensional dense graph structure via adaptive graph generation and captures the relationships between sequences with an adaptive graph convolution; (ii) a stacked encoder with distilling probsparse graph self-attention, which integrates the graph attention mechanism and retains the dominant attention of the cascade layer, preserving the correlation between sparse queries from long sequences; and (iii) a stacked decoder with generative inference, which generates all prediction values in one forward pass, improving the inference speed of long-horizon predictions. Experimental results on 4 large-scale datasets demonstrate that AGCNT outperforms state-of-the-art baselines.
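The abstract does not specify how the adaptive graph in (i) is built, but a common formulation for learning a graph over multivariate series (used by Graph WaveNet and MTGNN, and consistent with the description above) derives a dense adjacency matrix from learnable node embeddings. The sketch below illustrates that idea only; the class name `AdaptiveGraphConv` and all hyperparameters are illustrative assumptions, not AGCNT's actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveGraphConv(nn.Module):
    """Learns a dense adjacency over the N series from node embeddings,
    then mixes information across series with one graph-convolution step."""
    def __init__(self, num_nodes: int, embed_dim: int, channels: int):
        super().__init__()
        # Learnable node embeddings; their inner product defines the graph.
        self.e1 = nn.Parameter(torch.randn(num_nodes, embed_dim))
        self.e2 = nn.Parameter(torch.randn(num_nodes, embed_dim))
        self.proj = nn.Linear(channels, channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, num_nodes, channels) -- one feature vector per series.
        # Dense, row-normalized adjacency (Graph WaveNet-style assumption).
        adj = F.softmax(F.relu(self.e1 @ self.e2.t()), dim=-1)  # (N, N)
        # Propagate each node's features from its learned neighbors.
        out = torch.einsum("nm,bmc->bnc", adj, x)
        return self.proj(out)

# Usage: 8 series, 16-dim node embeddings, 64 feature channels.
layer = AdaptiveGraphConv(num_nodes=8, embed_dim=16, channels=64)
y = layer(torch.randn(2, 8, 64))  # -> (2, 8, 64)
```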
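The "probsparse" attention that (i) and (ii) build on originates in Informer (Zhou et al., 2021), where each query is scored by how far its attention distribution is from uniform, and only the top-u dominant queries attend in full. A minimal sketch of that query-selection step follows; note that the real Informer estimates the score on a random sample of keys to avoid materializing the full O(L^2) score matrix, and AGCNT's graph-attention variant will differ in detail.

```python
import math
import torch

def probsparse_topu_queries(Q: torch.Tensor, K: torch.Tensor):
    """Score each query by (max - mean) of its scaled dot products with the
    keys and keep only the top-u 'dominant' queries, u ~ c * ln(L_Q)."""
    L_Q, d = Q.shape[-2], Q.shape[-1]
    scores = Q @ K.transpose(-2, -1) / math.sqrt(d)      # (..., L_Q, L_K)
    # Sparsity measure: queries whose attention is far from uniform score high.
    m = scores.max(dim=-1).values - scores.mean(dim=-1)  # (..., L_Q)
    u = max(1, int(5 * math.log(L_Q)))                   # c = 5, a common choice
    top_idx = m.topk(u, dim=-1).indices                  # dominant-query indices
    return top_idx, scores

Q = torch.randn(2, 96, 32)   # (batch, L_Q, d)
K = torch.randn(2, 96, 32)
idx, _ = probsparse_topu_queries(Q, K)  # idx: (2, u)
```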
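Generative inference, as described in (iii), feeds the decoder a known "start token" slice of the history concatenated with placeholders for the horizon, so all predicted values come out of a single forward pass instead of step-by-step autoregression. A hedged sketch of how such a decoder input is typically assembled (following Informer's scheme; the helper name and lengths are assumptions):

```python
import torch

def build_decoder_input(history: torch.Tensor, label_len: int, pred_len: int):
    """Concatenate a 'start token' slice of known history with zero
    placeholders for the horizon, so the decoder can emit all pred_len
    values in one forward pass."""
    start_token = history[:, -label_len:, :]              # known context
    placeholder = torch.zeros(history.size(0), pred_len,
                              history.size(-1))           # future slots
    return torch.cat([start_token, placeholder], dim=1)   # (B, label+pred, C)

hist = torch.randn(2, 96, 7)                              # 96 past steps, 7 series
dec_in = build_decoder_input(hist, label_len=48, pred_len=24)  # -> (2, 72, 7)
```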