Rethinking Attention Mechanism in Time Series Classification

Bowen Zhao, Huanlai Xing, Xinhan Wang, Fuhong Song, Zhiwen Xiao
{"title":"重新思考时间序列分类中的注意机制","authors":"Bowen Zhao, Huanlai Xing, Xinhan Wang, Fuhong Song, Zhiwen Xiao","doi":"10.48550/arXiv.2207.07564","DOIUrl":null,"url":null,"abstract":"Attention-based models have been widely used in many areas, such as computer vision and natural language processing. However, relevant applications in time series classification (TSC) have not been explored deeply yet, causing a significant number of TSC algorithms still suffer from general problems of attention mechanism, like quadratic complexity. In this paper, we promote the efficiency and performance of the attention mechanism by proposing our flexible multi-head linear attention (FMLA), which enhances locality awareness by layer-wise interactions with deformable convolutional blocks and online knowledge distillation. What's more, we propose a simple but effective mask mechanism that helps reduce the noise influence in time series and decrease the redundancy of the proposed FMLA by masking some positions of each given series proportionally. To stabilize this mechanism, samples are forwarded through the model with random mask layers several times and their outputs are aggregated to teach the same model with regular mask layers. We conduct extensive experiments on 85 UCR2018 datasets to compare our algorithm with 11 well-known ones and the results show that our algorithm has comparable performance in terms of top-1 accuracy. We also compare our model with three Transformer-based models with respect to the floating-point operations per second and number of parameters and find that our algorithm achieves significantly better efficiency with lower complexity.","PeriodicalId":13641,"journal":{"name":"Inf. Sci.","volume":"107 1","pages":"97-114"},"PeriodicalIF":0.0000,"publicationDate":"2022-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"10","resultStr":"{\"title\":\"Rethinking Attention Mechanism in Time Series Classification\",\"authors\":\"Bowen Zhao, Huanlai Xing, Xinhan Wang, Fuhong Song, Zhiwen Xiao\",\"doi\":\"10.48550/arXiv.2207.07564\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Attention-based models have been widely used in many areas, such as computer vision and natural language processing. However, relevant applications in time series classification (TSC) have not been explored deeply yet, causing a significant number of TSC algorithms still suffer from general problems of attention mechanism, like quadratic complexity. In this paper, we promote the efficiency and performance of the attention mechanism by proposing our flexible multi-head linear attention (FMLA), which enhances locality awareness by layer-wise interactions with deformable convolutional blocks and online knowledge distillation. What's more, we propose a simple but effective mask mechanism that helps reduce the noise influence in time series and decrease the redundancy of the proposed FMLA by masking some positions of each given series proportionally. To stabilize this mechanism, samples are forwarded through the model with random mask layers several times and their outputs are aggregated to teach the same model with regular mask layers. We conduct extensive experiments on 85 UCR2018 datasets to compare our algorithm with 11 well-known ones and the results show that our algorithm has comparable performance in terms of top-1 accuracy. 
We also compare our model with three Transformer-based models with respect to the floating-point operations per second and number of parameters and find that our algorithm achieves significantly better efficiency with lower complexity.\",\"PeriodicalId\":13641,\"journal\":{\"name\":\"Inf. Sci.\",\"volume\":\"107 1\",\"pages\":\"97-114\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-07-14\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"10\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Inf. Sci.\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.48550/arXiv.2207.07564\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Inf. Sci.","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.48550/arXiv.2207.07564","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 10

Abstract

Attention-based models have been widely used in many areas, such as computer vision and natural language processing. However, their application to time series classification (TSC) has not yet been explored in depth, so a significant number of TSC algorithms still suffer from general problems of the attention mechanism, such as quadratic complexity. In this paper, we improve the efficiency and performance of the attention mechanism by proposing flexible multi-head linear attention (FMLA), which enhances locality awareness through layer-wise interactions with deformable convolutional blocks and online knowledge distillation. Moreover, we propose a simple but effective mask mechanism that reduces the influence of noise in time series and decreases the redundancy of FMLA by masking a fixed proportion of the positions in each given series. To stabilize this mechanism, each sample is forwarded through the model with random mask layers several times, and the outputs are aggregated to teach the same model with regular mask layers. We conduct extensive experiments on 85 UCR2018 datasets to compare our algorithm with 11 well-known algorithms, and the results show that ours achieves comparable top-1 accuracy. We also compare our model with three Transformer-based models with respect to floating-point operations per second and the number of parameters, and find that our algorithm achieves significantly better efficiency with lower complexity.
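The abstract describes two mechanisms concretely enough to sketch: a linear-complexity attention head (addressing the quadratic cost it criticizes) and proportional random masking whose multi-pass outputs are aggregated into a self-distillation target. The PyTorch sketch below is only an illustration of those two ideas, not the authors' FMLA implementation; names such as LinearAttention, proportional_mask, mask_ratio, and n_passes are hypothetical, and the deformable convolutional blocks and online knowledge distillation between branches are omitted.

```python
# Hedged sketch: kernelized linear attention + proportional masking with
# aggregated self-distillation, as loosely described in the abstract.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LinearAttention(nn.Module):
    """Kernelized attention: O(L) in sequence length instead of O(L^2)."""
    def __init__(self, dim, heads=4):
        super().__init__()
        self.heads = heads
        self.qkv = nn.Linear(dim, dim * 3)
        self.out = nn.Linear(dim, dim)

    def forward(self, x):                                   # x: (B, L, D)
        B, L, D = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # split heads -> (B, H, L, D/H)
        q, k, v = (t.view(B, L, self.heads, -1).transpose(1, 2) for t in (q, k, v))
        q, k = F.elu(q) + 1, F.elu(k) + 1                    # positive feature map
        kv = torch.einsum("bhld,bhle->bhde", k, v)           # (B, H, D/H, D/H)
        z = 1 / (torch.einsum("bhld,bhd->bhl", q, k.sum(2)) + 1e-6)
        out = torch.einsum("bhld,bhde,bhl->bhle", q, kv, z)  # (B, H, L, D/H)
        return self.out(out.transpose(1, 2).reshape(B, L, D))

def proportional_mask(x, mask_ratio=0.15):
    """Zero out a fixed proportion of time steps in each series (hypothetical ratio)."""
    B, L, _ = x.shape
    keep = (torch.rand(B, L, device=x.device) > mask_ratio).float()
    return x * keep.unsqueeze(-1)

def distillation_step(model, x, y, mask_ratio=0.15, n_passes=3, alpha=0.5, T=2.0):
    """One training step: average predictions from several randomly masked
    passes and use them as a soft target for the regular pass."""
    with torch.no_grad():
        soft = torch.stack(
            [F.softmax(model(proportional_mask(x, mask_ratio)) / T, dim=-1)
             for _ in range(n_passes)]
        ).mean(0)
    logits = model(x)                                        # model returns class logits
    loss = (1 - alpha) * F.cross_entropy(logits, y) + \
           alpha * F.kl_div(F.log_softmax(logits / T, dim=-1), soft,
                            reduction="batchmean") * T * T
    return loss

# usage with a hypothetical classifier: loss = distillation_step(model, x, y); loss.backward()
```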