Improved SSD Algorithm Based on Multi-scale Feature Fusion and Residual Attention Mechanism
Yongquan Zhao
2021 3rd International Conference on Advances in Computer Technology, Information Science and Communication (CTISC), 2021-04-01
DOI: https://doi.org/10.1109/CTISC52352.2021.00024
Citations: 0
Abstract
Convolutional neural networks (CNNs) have led to significant progress in object detection. To detect objects of various sizes, object detectors often exploit the hierarchy of multi-scale feature maps, called a feature pyramid, which the CNN architecture readily provides. However, such feature maps do not fully account for the supplementary effect of contextual information on semantics. In this work, we propose a feature fusion method with residual attention, built on the SSD baseline network and called Improved SSD, that makes full use of context information to improve the representational power of the feature maps. In addition, we use the residual attention mechanism to reinforce key features and further improve detector performance. Experimental results on the PASCAL VOC benchmark dataset show that the proposed method achieves a mean average precision (mAP) of 78.8% and 80.7% with input image sizes of 300×300 and 512×512, respectively.
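The abstract does not give the fusion or attention formulas, so the following is only an illustrative sketch of the two ideas it names, not the paper's implementation. It assumes that context is fused by upsampling a deeper feature map and adding it element-wise to a shallower one, and that residual attention takes the common form out = (1 + M(x)) * x with a mask M in (0, 1). The function names, the channel-wise mask, and the weight matrix `w` are all hypothetical, and both maps are assumed to already have the same number of channels.

```python
import numpy as np

def upsample_nearest(x, factor):
    """Nearest-neighbour upsampling of a (C, H, W) feature map."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fuse(shallow, deep):
    """Add an upsampled deeper (context) map to a shallower map.

    Assumption: element-wise addition; the paper may fuse differently.
    """
    factor = shallow.shape[1] // deep.shape[1]
    return shallow + upsample_nearest(deep, factor)

def residual_attention(x, w):
    """Residual attention, out = (1 + M(x)) * x.

    Assumption: M is a per-channel mask computed from global average
    pooling followed by a hypothetical (C, C) weight matrix w.
    """
    pooled = x.mean(axis=(1, 2))      # (C,) global average per channel
    mask = sigmoid(w @ pooled)        # (C,) values in (0, 1)
    return (1.0 + mask)[:, None, None] * x

# Toy example with random data standing in for two pyramid levels.
rng = np.random.default_rng(0)
shallow = rng.standard_normal((4, 8, 8))   # higher-resolution map
deep = rng.standard_normal((4, 4, 4))      # lower-resolution, more context
w = rng.standard_normal((4, 4))            # hypothetical attention weights

fused = fuse(shallow, deep)
out = residual_attention(fused, w)
```

Because the mask is added to an identity term, the attention branch can only amplify features (by a factor between 1 and 2 here) and never suppresses them to zero, which is the usual motivation for the residual form.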