Behavior Recognition Network for Small Target Vehicle based on Attention Mechanism
Authors: Zhuoyi Wu, Zhiqiang Ma, Caijilahu Bao, Leixiao Li, Xiaoxu Zhang, Fangyuan Zhu
Published in: Proceedings of the 4th World Symposium on Software Engineering, 2022-09-28
DOI: https://doi.org/10.1145/3568364.3568365
Vehicle behavior recognition is one of the most important research fields in intelligent transportation. In traffic surveillance videos, target vehicles occupy only a small proportion of the video frame, making it difficult for a network to extract the key features of small target vehicles. To address this problem, this paper constructs vehicle-3C, a video dataset for vehicle behavior recognition, and, drawing on deep learning theory, proposes a vehicle behavior recognition method called Two-Stream with Attention Network (TSAN). The method uses a two-stream convolutional network as its basic framework and embeds attention units in it to extract the temporal (motion) features and spatial features of target vehicles in traffic surveillance videos; the temporal and spatial features are then fused for classification. TSAN achieves 81.8% recognition accuracy on the vehicle-3C dataset, higher than other deep-learning-based behavior recognition methods. TSAN also achieves 77.2% recognition accuracy on the UCF-101 dataset, which supports the network's generalization performance. The experimental results show that TSAN can accurately extract and effectively fuse the temporal and spatial features of foreground target vehicles in video and achieve high recognition accuracy on the vehicle behavior recognition task.
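The pipeline described in the abstract (attention applied within each stream, followed by fusion of the two feature sets and a class decision) can be illustrated with a small sketch. This is not the authors' TSAN implementation: the abstract does not specify the form of the attention unit, the fusion operator, or the classifier, so the sketch assumes softmax attention pooling over per-location feature vectors, fusion by concatenation, and a single linear layer. All function names, shapes, and numbers below are hypothetical.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(features, scores):
    """Attention-weighted sum of per-location feature vectors.

    features: list of N vectors (each a list of D floats), one per location.
    scores:   list of N relevance scores; a higher score means more attention.
    Assumption: a real attention unit would learn these scores; here they
    are passed in directly to keep the sketch self-contained.
    """
    weights = softmax(scores)
    dim = len(features[0])
    return [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]


def linear(vector, weight_rows, bias):
    """One fully connected layer: one output logit per weight row."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weight_rows, bias)]


def classify(spatial_feats, spatial_scores, temporal_feats, temporal_scores,
             weight_rows, bias):
    """Pool each stream with attention, fuse by concatenation, then classify.

    Assumption: concatenation is only one possible fusion choice; the
    abstract does not say which operator TSAN uses.
    """
    spatial = attention_pool(spatial_feats, spatial_scores)
    temporal = attention_pool(temporal_feats, temporal_scores)
    fused = spatial + temporal  # list concatenation
    probs = softmax(linear(fused, weight_rows, bias))
    return probs.index(max(probs)), probs


if __name__ == "__main__":
    # Toy inputs: two spatial locations (D=2) and two temporal locations (D=1).
    label, probs = classify(
        spatial_feats=[[1.0, 0.0], [0.0, 1.0]], spatial_scores=[0.0, 0.0],
        temporal_feats=[[2.0], [4.0]], temporal_scores=[0.0, 0.0],
        weight_rows=[[1.0, 1.0, 0.0], [0.0, 0.0, 1.0]], bias=[0.0, 0.0],
    )
    print(label, probs)
```

The attention scores are what let a small foreground region dominate the pooled feature: a location with a much higher score receives nearly all of the weight, so background locations contribute little to the fused representation.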