Action Recognition by Jointly Using Video Proposal and Trajectory
Lei Qi, Xiaoqiang Lu, Xuelong Li
Proceedings of the 2nd International Conference on Vision, Image and Signal Processing, 2018-08-27
DOI: 10.1145/3271553.3271563 (https://doi.org/10.1145/3271553.3271563)
Human action recognition in videos is a popular but challenging research topic in the computer vision community. In recent years, trajectory-based methods have proven effective for action recognition. However, because trajectories are generated around motion regions, these methods tend to attend only to regions with high motion saliency and ignore objects that are motionless but semantically meaningful. To compensate for this shortcoming, this paper uses video proposals, which are able to discover semantic objects. In the proposed method, video proposals and trajectories are extracted simultaneously to capture both motion information and object information. The method consists of three steps: 1) trajectories and video proposals are extracted from the video to capture motion information and object information, respectively; 2) a trained Convolutional Neural Network (CNN) model is used to describe the extracted trajectories and video proposals; 3) a holistic representation of the video is constructed with a Fisher Vector model and fed to a classifier to obtain the action label. The complementarity between trajectories and video proposals gives the proposed method discriminative power across different kinds of actions. The method is evaluated on UCF101 and HMDB51, where promising results demonstrate its effectiveness.
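Step 3 of the pipeline can be illustrated with a minimal, dependency-free sketch of Fisher Vector encoding. It is not the authors' implementation. It assumes a diagonal-covariance Gaussian mixture whose parameters are already fitted, applies the commonly used power and L2 normalization, and fuses the trajectory and proposal streams by simple concatenation. The abstract specifies none of these choices, and the names `fisher_vector` and `video_representation` are hypothetical.

```python
import math


def fisher_vector(descriptors, weights, means, variances):
    """Encode local descriptors as a Fisher Vector under a diagonal GMM.

    descriptors: list of d-dimensional feature lists (e.g. CNN descriptors).
    weights, means, variances: GMM parameters for k components
    (assumed to be fitted beforehand; fitting is not shown here).
    """
    n, k, d = len(descriptors), len(weights), len(means[0])
    grad_mu = [[0.0] * d for _ in range(k)]
    grad_var = [[0.0] * d for _ in range(k)]

    for x in descriptors:
        # Log of the weighted Gaussian density for each component.
        log_p = []
        for w, mu, var in zip(weights, means, variances):
            ll = math.log(w)
            for xi, m, v in zip(x, mu, var):
                ll -= 0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
            log_p.append(ll)
        # Posterior (soft assignment), computed stably via max subtraction.
        top = max(log_p)
        unnorm = [math.exp(lp - top) for lp in log_p]
        total = sum(unnorm)
        for j in range(k):
            gamma = unnorm[j] / total
            for i in range(d):
                z = (x[i] - means[j][i]) / math.sqrt(variances[j][i])
                grad_mu[j][i] += gamma * z              # gradient w.r.t. mean
                grad_var[j][i] += gamma * (z * z - 1)   # gradient w.r.t. variance

    fv = []
    for j in range(k):
        fv += [g / (n * math.sqrt(weights[j])) for g in grad_mu[j]]
    for j in range(k):
        fv += [g / (n * math.sqrt(2 * weights[j])) for g in grad_var[j]]

    # Assumption: signed square-root (power) normalization, then L2.
    fv = [math.copysign(math.sqrt(abs(v)), v) for v in fv]
    norm = math.sqrt(sum(v * v for v in fv))
    return [v / norm for v in fv] if norm > 0 else fv


def video_representation(traj_desc, prop_desc, gmm_traj, gmm_prop):
    """Assumed fusion: concatenate the trajectory and proposal encodings."""
    return fisher_vector(traj_desc, *gmm_traj) + fisher_vector(prop_desc, *gmm_prop)


# Toy data: 2-D descriptors and a 2-component GMM (illustrative values only).
gmm = ([0.5, 0.5], [[0.0, 0.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]])
trajectory_descriptors = [[0.0, 0.1], [1.0, 0.9], [0.2, -0.1]]
proposal_descriptors = [[0.9, 1.1], [0.1, 0.0]]
video_vector = video_representation(
    trajectory_descriptors, proposal_descriptors, gmm, gmm
)
print(len(video_vector))  # 16: 2 * k * d entries per stream, two streams
```

The resulting vector would then be passed to a classifier to predict the action label. In practice the descriptors would come from the CNN in step 2, and the mixture would be fitted on training descriptors.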