Attention-based 3D convolutional networks
Enjie Ding, Dawei Xu, Yingfei Zhao, Zhongyu Liu, Yafeng Liu
Journal of Experimental & Theoretical Artificial Intelligence, volume 13 1, pages 93-108. Published 2022-02-13. Impact factor 1.7 (JCR Q3, Computer Science, Artificial Intelligence).
DOI: https://doi.org/10.1080/0952813X.2021.1960625
Citations: 1
Abstract
Being simple and portable, the three-dimensional (3D) convolution network has achieved great success in action recognition. However, its applicability in spatiotemporal feature learning is not evident. This study aims to improve the 3D convolution model by proposing a flexible and significant attention module for extracting spatiotemporal information. Our first contribution is a pair of modules, a self-additive attention module and a feature-based attention module, which together give a simple yet effective method for measuring the spatiotemporal importance of a video. In self-additive attention, the spatiotemporal fusion between frames is defined intuitively: equivalent weights between the video frames are set manually. The feature-based attention, in contrast, is trained adaptively by the 3D convolution process and combines the spatiotemporal information from the feature map. This study also focuses on attention fusion for learning the spatiotemporal characteristics in 3D convolution. On the UCF101 and HMDB51 datasets, the proposed attention fusion method performs strongly in comparison with recently developed attention modules and the latest 3D networks. The experiments show consistent improvements, affirming the robustness of the method in extracting spatiotemporal attention.
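The abstract contrasts hand-set, equally weighted frame fusion with attention weights that are learned. The sketch below illustrates that contrast only and is not the paper's implementation. It treats a clip as a list of flattened frames and makes several assumptions of its own: the temporal `window`, the scalar `score_weight` standing in for trained 3D-convolution parameters, the softmax scoring, and fusion by element-wise sum are all hypothetical choices. The function names are likewise invented for this example.

```python
import math


def self_additive_attention(frames, window=1):
    """Fuse each frame with its temporal neighbours using equal, hand-set weights."""
    fused = []
    n = len(frames)
    for t in range(n):
        lo, hi = max(0, t - window), min(n, t + window + 1)
        neighbours = frames[lo:hi]
        weight = 1.0 / len(neighbours)  # equal weight per frame, fixed manually
        fused.append([weight * sum(values) for values in zip(*neighbours)])
    return fused


def feature_based_attention(frames, score_weight=1.0):
    """Reweight frames by a softmax over per-frame scores.

    score_weight is a stand-in (assumption) for parameters that a real
    model would learn during 3D-convolution training.
    """
    scores = [score_weight * sum(f) / len(f) for f in frames]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    attn = [e / total for e in exps]
    weighted = [[a * x for x in f] for a, f in zip(attn, frames)]
    return weighted, attn


def fuse_attention(frames, window=1, score_weight=1.0):
    """Combine both branches by element-wise sum (an assumed fusion rule)."""
    additive = self_additive_attention(frames, window)
    weighted, _ = feature_based_attention(frames, score_weight)
    return [[a + w for a, w in zip(fa, fw)] for fa, fw in zip(additive, weighted)]


if __name__ == "__main__":
    clip = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]  # 3 frames, 2 values each
    print(self_additive_attention(clip))
    print(feature_based_attention(clip)[1])
    print(fuse_attention(clip))
```

In this toy version the self-additive branch has no trainable parts, while the feature-based branch changes its frame weights whenever `score_weight` changes, which mirrors the manual-versus-adaptive distinction drawn in the abstract.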
About the journal:
Journal of Experimental & Theoretical Artificial Intelligence (JETAI) is a world-leading journal dedicated to publishing high-quality, rigorously reviewed, original papers in artificial intelligence (AI) research.
The journal features work in all subfields of AI research and accepts both theoretical and applied research. Topics covered include, but are not limited to, the following:
• cognitive science
• games
• learning
• knowledge representation
• memory and neural system modelling
• perception
• problem-solving