Improved Faster R-CNN for Automatic Video Annotations
Qing Liu, Ziyu Xue, Lei Wang, Peiyu Guo
2021 6th International Conference on Computational Intelligence and Applications (ICCIA), 2021-06-01
DOI: 10.1109/ICCIA52886.2021.00048 (https://doi.org/10.1109/ICCIA52886.2021.00048)
Citation count: 0
Abstract
As digital multimedia grows, managing large volumes of existing and newly added media assets has become an urgent problem. Advances in machine learning make it possible to use object detection frameworks for intelligent cataloging and management of media assets, which can greatly improve work efficiency. Faster R-CNN is widely used for intelligent cataloging, but its accuracy is limited by its feature extraction network. To address this, this paper starts from the Faster R-CNN object detection framework with VGG-16 as the feature extraction network and designs a new framework, DF-Faster R-CNN, that uses ResNet-101 as the feature extraction network, which improves recognition precision. Compared with current mainstream methods, the proposed model can effectively identify overlapping, occluded, and blurred objects in video, and is better suited to image recognition in film and television works. Tests on the MS COCO dataset show that the method significantly improves mAP over mainstream frameworks such as Fast R-CNN, Faster R-CNN, Pelee, and SIN, and it also achieves higher precision in recognition experiments on ten object types from the PASCAL VOC dataset.
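The abstract reports results as mAP without stating the evaluation protocol. As a rough illustration only, the sketch below shows one common way the metric is computed: box overlap (IoU), per-class average precision as the area under an interpolated precision-recall curve, and mAP as the mean over classes. The function names, the input layout, and the assumption that each detection has already been marked as a true or false positive (for example, by an IoU threshold of 0.5, as in PASCAL VOC) are illustrative choices, not details taken from the paper. MS COCO additionally averages over several IoU thresholds, which this sketch does not do.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(detections, num_gt):
    """AP for one class.

    detections: list of (score, is_true_positive) pairs; the matching of
                detections to ground truth is assumed to be done already.
    num_gt:     number of ground-truth objects of this class.
    """
    if num_gt == 0:
        return 0.0
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    tp = fp = 0
    precisions, recalls = [], []
    for _, is_tp in ranked:
        if is_tp:
            tp += 1
        else:
            fp += 1
        precisions.append(tp / (tp + fp))
        recalls.append(tp / num_gt)
    # Interpolate: make precision non-increasing as recall grows.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sum the area under the interpolated precision-recall curve.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(per_class):
    """per_class maps a class name to (detections, num_gt)."""
    aps = [average_precision(d, n) for d, n in per_class.values()]
    return sum(aps) / len(aps) if aps else 0.0


if __name__ == "__main__":
    # Toy, made-up inputs purely to exercise the functions.
    toy = {
        "person": ([(0.9, True), (0.8, False), (0.7, True)], 2),
        "car": ([(0.95, True)], 1),
    }
    print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
    print(mean_average_precision(toy))
```

The toy inputs in the `__main__` block are invented to exercise the functions and do not correspond to any result in the paper.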