{"title":"增强的反向变压器:用于时间序列预测的改进变量标记编码和混合","authors":"Xin-Yi Li, Yu-Bin Yang","doi":"10.1007/s10489-025-06886-4","DOIUrl":null,"url":null,"abstract":"<div><p>Recent advancements in channel-dependent Transformer-based forecasters highlight the efficacy of variate tokenization for time series forecasting. Despite this progress, challenges remain in handling complex time series. The vanilla Transformer, while effective in certain scenarios, faces limitations in addressing intricate cross-variate interactions and diverse temporal patterns. This paper presents the Enhanced Inverted Transformer (EiT for short), enhancing standard Transformer blocks for advanced modeling and blending of variate tokens. EiT incorporates three key innovations: First, a hybrid multi-patch attention mechanism that adaptively fuses global and local attention maps, capturing both stable and volatile correlations to mitigate overfitting and enrich inter-channel communication. Second, a multi-head feed-forward network with specialized heads for various temporal patterns, enhancing parameter efficiency and contributing to robust multivariate predictions. Third, paired channel normalization applied to each layer, preserving crucial channel-specific statistics and boosting forecasting performance. By integrating these innovations, EiT effectively overcomes limitations and unlocks the potential of variate tokens for accurate and robust multivariate time series forecasting. Extensive evaluations demonstrate that EiT achieves state-of-the-art (SOTA) performance, surpassing the previous method, the inverted Transformer, by an average of 4.4% in Mean Squared Error (MSE) and 3.4% in Mean Absolute Error (MAE) across five challenging long-term forecasting datasets.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 15","pages":""},"PeriodicalIF":3.5000,"publicationDate":"2025-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Enhanced inverted transformer: advancing variate token encoding and blending for time series forecasting\",\"authors\":\"Xin-Yi Li, Yu-Bin Yang\",\"doi\":\"10.1007/s10489-025-06886-4\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>Recent advancements in channel-dependent Transformer-based forecasters highlight the efficacy of variate tokenization for time series forecasting. Despite this progress, challenges remain in handling complex time series. The vanilla Transformer, while effective in certain scenarios, faces limitations in addressing intricate cross-variate interactions and diverse temporal patterns. This paper presents the Enhanced Inverted Transformer (EiT for short), enhancing standard Transformer blocks for advanced modeling and blending of variate tokens. EiT incorporates three key innovations: First, a hybrid multi-patch attention mechanism that adaptively fuses global and local attention maps, capturing both stable and volatile correlations to mitigate overfitting and enrich inter-channel communication. Second, a multi-head feed-forward network with specialized heads for various temporal patterns, enhancing parameter efficiency and contributing to robust multivariate predictions. Third, paired channel normalization applied to each layer, preserving crucial channel-specific statistics and boosting forecasting performance. 
By integrating these innovations, EiT effectively overcomes limitations and unlocks the potential of variate tokens for accurate and robust multivariate time series forecasting. Extensive evaluations demonstrate that EiT achieves state-of-the-art (SOTA) performance, surpassing the previous method, the inverted Transformer, by an average of 4.4% in Mean Squared Error (MSE) and 3.4% in Mean Absolute Error (MAE) across five challenging long-term forecasting datasets.</p></div>\",\"PeriodicalId\":8041,\"journal\":{\"name\":\"Applied Intelligence\",\"volume\":\"55 15\",\"pages\":\"\"},\"PeriodicalIF\":3.5000,\"publicationDate\":\"2025-09-22\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Applied Intelligence\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://link.springer.com/article/10.1007/s10489-025-06886-4\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Applied Intelligence","FirstCategoryId":"94","ListUrlMain":"https://link.springer.com/article/10.1007/s10489-025-06886-4","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Enhanced inverted transformer: advancing variate token encoding and blending for time series forecasting
Recent advancements in channel-dependent Transformer-based forecasters highlight the efficacy of variate tokenization for time series forecasting. Despite this progress, challenges remain in handling complex time series. The vanilla Transformer, while effective in certain scenarios, faces limitations in addressing intricate cross-variate interactions and diverse temporal patterns. This paper presents the Enhanced Inverted Transformer (EiT), which augments standard Transformer blocks for more expressive modeling and blending of variate tokens. EiT incorporates three key innovations. First, a hybrid multi-patch attention mechanism adaptively fuses global and local attention maps, capturing both stable and volatile correlations to mitigate overfitting and enrich inter-channel communication. Second, a multi-head feed-forward network with specialized heads for different temporal patterns improves parameter efficiency and contributes to robust multivariate predictions. Third, paired channel normalization, applied at each layer, preserves crucial channel-specific statistics and boosts forecasting performance. By integrating these innovations, EiT overcomes the limitations above and unlocks the potential of variate tokens for accurate and robust multivariate time series forecasting. Extensive evaluations demonstrate that EiT achieves state-of-the-art (SOTA) performance, surpassing the previous method, the inverted Transformer, by an average of 4.4% in Mean Squared Error (MSE) and 3.4% in Mean Absolute Error (MAE) across five challenging long-term forecasting datasets.
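To make the abstract's three components concrete, below is a minimal PyTorch sketch of the ideas as described: variate tokens (each channel's whole lookback window embedded as one token), a learned blend of global and local attention maps, a feed-forward network with several specialized heads, and per-channel normalization. This is an illustration assembled only from the abstract, not the authors' implementation: all module names, shapes, and hyperparameters are hypothetical, the banded channel window standing in for "local" attention is an assumption, and plain per-token LayerNorm is used as a stand-in for the paper's paired channel normalization.

```python
# Illustrative sketch of the EiT ideas from the abstract. All names and
# design details below are assumptions for illustration, not the paper's code.
import torch
import torch.nn as nn


class HybridVariateAttention(nn.Module):
    """Blends a global attention map over all variate tokens with a local
    (banded) map restricted to nearby channels, via a learned gate."""

    def __init__(self, dim: int, window: int = 3):
        super().__init__()
        self.qkv = nn.Linear(dim, 3 * dim)
        self.proj = nn.Linear(dim, dim)
        self.gate = nn.Parameter(torch.zeros(1))  # sigmoid(0) = 0.5: even blend at init
        self.window = window
        self.scale = dim ** -0.5

    def forward(self, x):  # x: (batch, n_variates, dim)
        b, n, d = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        scores = (q @ k.transpose(-2, -1)) * self.scale   # (b, n, n)
        global_attn = scores.softmax(dim=-1)

        # "Local" map (an assumption): mask channel pairs outside a band.
        idx = torch.arange(n, device=x.device)
        band = (idx[None, :] - idx[:, None]).abs() <= self.window
        local_attn = scores.masked_fill(~band, float("-inf")).softmax(dim=-1)

        g = torch.sigmoid(self.gate)                      # adaptive fusion weight
        attn = g * global_attn + (1 - g) * local_attn
        return self.proj(attn @ v)


class MultiHeadFFN(nn.Module):
    """Several small FFN heads, each free to specialize on a different
    temporal pattern; outputs are combined with learned weights."""

    def __init__(self, dim: int, n_heads: int = 4, hidden: int = 128):
        super().__init__()
        self.heads = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, dim))
            for _ in range(n_heads)
        )
        self.mix = nn.Parameter(torch.zeros(n_heads))

    def forward(self, x):
        w = self.mix.softmax(dim=0)
        return sum(wi * head(x) for wi, head in zip(w, self.heads))


class EiTBlock(nn.Module):
    """One pre-norm Transformer block over variate tokens. Plain LayerNorm
    stands in here for the paper's paired channel normalization."""

    def __init__(self, dim: int):
        super().__init__()
        self.norm1, self.norm2 = nn.LayerNorm(dim), nn.LayerNorm(dim)
        self.attn = HybridVariateAttention(dim)
        self.ffn = MultiHeadFFN(dim)

    def forward(self, x):
        x = x + self.attn(self.norm1(x))
        return x + self.ffn(self.norm2(x))


# Variate tokenization: embed each channel's full lookback series as one token.
lookback, n_variates, dim = 96, 7, 64
embed = nn.Linear(lookback, dim)
series = torch.randn(8, n_variates, lookback)  # (batch, channels, time)
tokens = embed(series)                         # (batch, channels, dim)
out = EiTBlock(dim)(tokens)
print(out.shape)                               # torch.Size([8, 7, 64])
```

The single scalar gate is the simplest possible "adaptive fusion"; the paper's mechanism is likely richer (e.g., per-head or input-conditioned weights over multiple patch granularities), but the sketch shows why the blend helps: the global map can track stable cross-channel correlations while the masked map limits the influence of volatile, spurious ones.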
Journal introduction:
With a focus on research in artificial intelligence and neural networks, this journal addresses issues involving solutions of real-life manufacturing, defense, management, government and industrial problems which are too complex to be solved through conventional approaches and require the simulation of intelligent thought processes, heuristics, applications of knowledge, and distributed and parallel processing. The integration of these multiple approaches in solving complex problems is of particular importance.
The journal presents new and original research and technological developments, addressing real and complex issues applicable to difficult problems. It provides a medium for exchanging scientific research and technological achievements accomplished by the international community.