A Weight Initialization Method for Compressed Video Action Recognition in Compressed Domain

Proceedings of the 2022 5th International Conference on Artificial Intelligence and Pattern Recognition, Pub Date: 2022-09-23, DOI: 10.1145/3573942.3574089

Rogeany Kanza, Chenyu Huang, Allah Rakhio Junejo, Zhuoming Li



Abstract

The exponential growth of big data, especially videos from smart devices and video sites, has become a real challenge for video analysis algorithms. Traditional video processing architectures, which mostly use RGB frames for video analysis tasks, suffer mainly from processing and storage difficulties. Decoding compressed videos is time-consuming and requires a lot of storage space. Although existing video analysis architectures based on convolutional neural networks (CNNs) have made notable advances, they still struggle to meet the requirements of many real-time scenarios and real-world applications. This is one of the motivations for the computer vision community to move toward action recognition directly in the compressed domain in order to overcome these issues. In addition, the performance of prominent methods depends heavily on the correct setting of initialization parameters, and the choice of initialization affects the final generalization performance of a neural network. This work proposes a weight initialization technique for action recognition on compressed videos in the compressed domain. Our approach was tested on the UCF-101 and HMDB-51 datasets. The performance evaluation shows the effectiveness of our proposed methodology.
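
To illustrate why the choice of initialization matters, the following is a minimal sketch only. It is not the method proposed in this paper, which the abstract does not describe. It assumes a plain fully connected ReLU network and uses the well-known He (Kaiming) normal scheme as a stand-in; all function names (`he_normal`, `naive_normal`, `forward_signal`) and the width, depth, and standard deviation values are hypothetical choices made for illustration.

```python
import math
import random


def he_normal(fan_in, fan_out, rng):
    """Draw a fan_out x fan_in weight matrix from N(0, 2 / fan_in) (He initialization)."""
    std = math.sqrt(2.0 / fan_in)
    return [[rng.gauss(0.0, std) for _ in range(fan_in)] for _ in range(fan_out)]


def naive_normal(fan_in, fan_out, rng, std=0.01):
    """Draw weights from N(0, std^2) with a fixed, width-independent std (assumed baseline)."""
    return [[rng.gauss(0.0, std) for _ in range(fan_in)] for _ in range(fan_out)]


def relu_layer(weights, x):
    """Apply one bias-free linear layer followed by ReLU."""
    return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in weights]


def mean_square(x):
    """Average squared activation, used here as a simple measure of signal strength."""
    return sum(v * v for v in x) / len(x)


def forward_signal(init_fn, width=64, depth=10, seed=0):
    """Push one random input through `depth` ReLU layers and report the final signal strength."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(width)]
    for _ in range(depth):
        x = relu_layer(init_fn(width, width, rng), x)
    return mean_square(x)


if __name__ == "__main__":
    print("He init, final mean square:   ", forward_signal(he_normal))
    print("Naive init, final mean square:", forward_signal(naive_normal))
```

Under these assumptions, the He-initialized network keeps the activation scale roughly constant from layer to layer, whereas the fixed small standard deviation makes the signal shrink toward zero with depth, which in turn makes gradient-based training harder to start.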

