{"title":"Real-Time Object Detection on 640x480 Image With VGG16+SSD","authors":"Hyeong-Ju Kang","doi":"10.1109/ICFPT47387.2019.00082","journal":{"name":"2019 International Conference on Field-Programmable Technology (ICFPT)","volume":"102 1","pages":"0"},"publicationDate":"2019-12-01","publicationTypes":"Journal Article","isOpenAccess":false,"citationCount":"12","platform":"Semanticscholar","ListUrlMain":"https://doi.org/10.1109/ICFPT47387.2019.00082"}
Real-Time Object Detection on 640x480 Image With VGG16+SSD
Convolutional neural networks (CNNs) perform well in computer vision tasks such as object detection, but their large weight-storage and computation requirements make real-time processing at 30 frames per second (FPS) difficult. This demonstration shows a CNN accelerator that performs real-time object detection on 640x480 images. It implements a complex, high-performance CNN: the single-shot multibox detector (SSD) with VGG16. A pruning scheme reduces the number of weights, and accelerator-aware pruning is applied to raise the utilization of the operators. The weights of the pruned network fit entirely in internal memory. The proposed design reaches 42 FPS on an XC7VX690T FPGA with a VOC07 test mAP of 78.13%.
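
The abstract does not spell out how the pruning works, so the following is only a minimal sketch of one plausible reading, not the author's implementation. It assumes that "accelerator-aware" means every group of weights fetched together keeps the same number of nonzero values, so that each group gives the operators the same amount of work. The function names `prune_groups` and `frame_time_ms`, the group size, and the sample weights are all hypothetical. The second helper simply converts the reported frame rate into a per-frame time budget.

```python
def prune_groups(weights, group_size, keep):
    """Zero all but the `keep` largest-magnitude weights in each group.

    Assumption (not from the source): the accelerator consumes weights in
    fixed groups of `group_size`, so an equal nonzero count per group
    keeps the workload per group uniform.
    """
    if len(weights) % group_size != 0:
        raise ValueError("weight count must be a multiple of group_size")
    if not 0 <= keep <= group_size:
        raise ValueError("keep must be between 0 and group_size")

    pruned = []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        # Positions of the `keep` largest magnitudes within this group.
        order = sorted(range(group_size), key=lambda i: abs(group[i]), reverse=True)
        kept = set(order[:keep])
        pruned.extend(w if i in kept else 0.0 for i, w in enumerate(group))
    return pruned


def frame_time_ms(fps):
    """Time available per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


# Hypothetical weights: two groups of four, keeping two per group.
weights = [0.9, -0.1, 0.05, -0.7, 0.2, 0.3, -0.8, 0.01]
pruned = prune_groups(weights, group_size=4, keep=2)
print(pruned)

# Per-frame time at the reported 42 FPS versus the 30 FPS real-time target.
print(round(frame_time_ms(42), 2), round(frame_time_ms(30), 2))
```

Under this assumption, every group ends up with exactly `keep` nonzero weights, which is what would let a fixed number of operators stay busy on every group. The frame-time helper shows that 42 FPS corresponds to roughly 23.8 ms per frame, inside the roughly 33.3 ms budget implied by 30 FPS.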