Behavior Recognition Based on Improved Faster RCNN
Bing Du, Ji Zhao, Mingyuan Cao, Mingyang Li, Hailong Yu
2021 14th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI), 2021-10-23
DOI: 10.1109/CISP-BMEI53629.2021.9624427
Citations: 0
Abstract
We divide the recognition process into two stages: “object detection” and “behavior prediction”. First, all objects in the image are detected; the detection results are then used as the input to the behavior recognition stage, which predicts the interactions between objects. During feature extraction, we add extra parameters to each sampling point of the convolution kernel so that the kernel can deform, which lets the network adapt better to complex scenes. For target detection, an attention mechanism is combined with ResNet, and the network structure is changed from “post-activation” to “pre-activation”, which gives the region proposals a degree of screening ability and helps avoid overfitting. For action prediction, the network takes each instance object in the feature map as the center, detects the surrounding objects it interacts with according to the object's appearance features and attention weights, and predicts the action scores between them. Finally, the network is trained on the enhanced COCO dataset. Compared with traditional methods, the proposed method detects the actions in an image well: its mAP reaches 67.2%, an increase of nearly 14 percentage points, which is of high experimental value.
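The idea of attaching extra parameters to each sampling point of a convolution kernel can be illustrated with a small NumPy sketch. This is not the authors' implementation. It assumes the extra parameters are per-tap (dy, dx) offsets in the style of deformable convolution, that fractional positions are read with bilinear interpolation, and that the offsets are passed in directly rather than predicted by a learned layer. The function names `bilinear_sample` and `deformable_conv2d` are hypothetical.

```python
import numpy as np


def bilinear_sample(feature, y, x):
    """Read a 2D feature map at a fractional (y, x); out-of-range taps count as 0."""
    h, w = feature.shape
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    value = 0.0
    # Blend the four integer neighbours, weighted by their distance to (y, x).
    for yi in (y0, y0 + 1):
        for xi in (x0, x0 + 1):
            if 0 <= yi < h and 0 <= xi < w:
                weight = (1.0 - abs(y - yi)) * (1.0 - abs(x - xi))
                value += weight * feature[yi, xi]
    return value


def deformable_conv2d(feature, kernel, offsets):
    """Single-channel, stride-1, zero-padded convolution with shifted sampling points.

    feature: (H, W) input map
    kernel:  (k, k) weights, k odd
    offsets: (H, W, k*k, 2) array of (dy, dx) per output position and kernel tap
             (assumed given here; a real network would predict them)
    """
    h, w = feature.shape
    k = kernel.shape[0]
    r = k // 2
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for t in range(k * k):
                ky, kx = divmod(t, k)
                dy, dx = offsets[i, j, t]
                # Regular grid position plus the extra offset for this tap.
                acc += kernel[ky, kx] * bilinear_sample(
                    feature, i + ky - r + dy, j + kx - r + dx
                )
            out[i, j] = acc
    return out


feature = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.ones((3, 3))
zero_offsets = np.zeros((4, 4, 9, 2))
plain = deformable_conv2d(feature, kernel, zero_offsets)
```

With all offsets set to zero the sketch reduces to an ordinary zero-padded convolution (cross-correlation); non-zero offsets move each sampling point independently, which is what allows the receptive field to follow the shape of an object.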