Rogeany Kanza, Chenyu Huang, Allah Rakhio Junejo, Zhuoming Li
Proceedings of the 2022 5th International Conference on Artificial Intelligence and Pattern Recognition. Published 2022-09-23. DOI: https://doi.org/10.1145/3573942.3574089
A Weight Initialization Method for Compressed Video Action Recognition in Compressed Domain
The rapid growth of big data, particularly video from smart devices and video-sharing sites, poses a real challenge for video analysis algorithms. Traditional video processing architectures, which mostly operate on RGB frames, suffer mainly from processing and storage difficulties. Decoding compressed videos is time-consuming and requires a lot of storage space. Although existing video analysis architectures based on convolutional neural networks (CNNs) have made notable advances, they still struggle to meet the requirements of many real-time scenarios and real-world applications. This is one of the motivations for the computer vision community to move to action recognition performed directly on compressed videos in the compressed domain, in order to overcome these issues. At the same time, the performance of prominent methods depends heavily on the correct setting of initialization parameters, and the choice of initialization affects the final generalization performance of a neural network. This work proposes a weight initialization technique for action recognition on compressed videos in the compressed domain. Our approach was tested on the UCF-101 and HMDB-51 datasets. The performance evaluation shows the effectiveness of the proposed methodology.