{"title":"一种基于边缘的实时目标检测方法","authors":"A. Ahmadinia, Jaabaal Shah","doi":"10.1109/ICMLA55696.2022.00075","DOIUrl":null,"url":null,"abstract":"This paper looks at performance bottlenecks of real-time object detection on edge devices. The \"You only look once v4\" (YOLOv4) is currently one of the leading state-of-the-art models for real-time object detection, and its tiny version: YOLOv4-tiny, is designed for edge devices. To improve object detection accuracy without sacrificing detection speed, we propose an object detection method based on YOLOv4-tiny and VGG-Net. First, we implement the mosaic data augmentation and Mish activation function to increase the generalization ability of the proposed model, making it more robust. Secondly, to enhance the richness of the features extracted, an extra 3x3 convolution layer is added in a way that two successive 3x3 convolutions are used to obtain 5x5 receptive fields. This would enable us to extract global features in the first CSP (Cross Stage Partial Network) Block and restructure the connections of the subsequent layers to have the same effect on the next CSP blocks. Evaluation results show that the proposed model has comparable performance and memory footprint but significantly greater accuracy than YOLOv4-tiny. Also, the proposed tiny model has similar performance to YOLOv4-tiny, and improves accuracy with much lower memory overhead, which makes it an ideal solution for real-time object detection, especially on edge devices.","PeriodicalId":128160,"journal":{"name":"2022 21st IEEE International Conference on Machine Learning and Applications (ICMLA)","volume":"58 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"An Edge-based Real-Time Object Detection\",\"authors\":\"A. Ahmadinia, Jaabaal Shah\",\"doi\":\"10.1109/ICMLA55696.2022.00075\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"This paper looks at performance bottlenecks of real-time object detection on edge devices. The \\\"You only look once v4\\\" (YOLOv4) is currently one of the leading state-of-the-art models for real-time object detection, and its tiny version: YOLOv4-tiny, is designed for edge devices. To improve object detection accuracy without sacrificing detection speed, we propose an object detection method based on YOLOv4-tiny and VGG-Net. First, we implement the mosaic data augmentation and Mish activation function to increase the generalization ability of the proposed model, making it more robust. Secondly, to enhance the richness of the features extracted, an extra 3x3 convolution layer is added in a way that two successive 3x3 convolutions are used to obtain 5x5 receptive fields. This would enable us to extract global features in the first CSP (Cross Stage Partial Network) Block and restructure the connections of the subsequent layers to have the same effect on the next CSP blocks. Evaluation results show that the proposed model has comparable performance and memory footprint but significantly greater accuracy than YOLOv4-tiny. 
Also, the proposed tiny model has similar performance to YOLOv4-tiny, and improves accuracy with much lower memory overhead, which makes it an ideal solution for real-time object detection, especially on edge devices.\",\"PeriodicalId\":128160,\"journal\":{\"name\":\"2022 21st IEEE International Conference on Machine Learning and Applications (ICMLA)\",\"volume\":\"58 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 21st IEEE International Conference on Machine Learning and Applications (ICMLA)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICMLA55696.2022.00075\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 21st IEEE International Conference on Machine Learning and Applications (ICMLA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICMLA55696.2022.00075","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract: This paper examines the performance bottlenecks of real-time object detection on edge devices. "You Only Look Once v4" (YOLOv4) is currently one of the leading state-of-the-art models for real-time object detection, and its compact variant, YOLOv4-tiny, is designed for edge devices. To improve detection accuracy without sacrificing detection speed, we propose an object detection method based on YOLOv4-tiny and VGG-Net. First, we apply mosaic data augmentation and the Mish activation function to increase the generalization ability of the proposed model and make it more robust. Second, to enrich the extracted features, an extra 3x3 convolution layer is added so that two successive 3x3 convolutions yield a 5x5 receptive field. This lets the first CSP (Cross Stage Partial Network) block extract more global features, and the connections of the subsequent layers are restructured to achieve the same effect in the following CSP blocks. Evaluation results show that the proposed model matches YOLOv4-tiny in detection speed and memory footprint while achieving significantly higher accuracy, and this accuracy gain comes with little additional memory overhead, which makes the model well suited to real-time object detection, especially on edge devices.
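As a rough illustration of two ingredients named in the abstract, the sketch below (plain PyTorch, not the authors' released code) defines the Mish activation, x * tanh(softplus(x)), and a block of two successive 3x3 convolutions whose combined receptive field is 5x5 but which uses fewer parameters than a single 5x5 convolution. The channel counts, feature-map size, and block name are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch: Mish activation + two stacked 3x3 convolutions (5x5 receptive field).
import torch
import torch.nn as nn
import torch.nn.functional as F


class Mish(nn.Module):
    """Mish activation: x * tanh(softplus(x)). (Recent PyTorch also ships nn.Mish.)"""
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * torch.tanh(F.softplus(x))


class StackedConvBlock(nn.Module):
    """Two successive 3x3 convolutions; together they cover a 5x5 receptive field."""
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(out_ch),
            Mish(),
            nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(out_ch),
            Mish(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.block(x)


if __name__ == "__main__":
    x = torch.randn(1, 32, 104, 104)      # dummy feature map (sizes are assumptions)
    y = StackedConvBlock(32, 64)(x)
    print(y.shape)                        # torch.Size([1, 64, 104, 104])
```

The padding of 1 keeps the spatial resolution unchanged, so such a block could be dropped into an existing backbone stage without altering downstream feature-map sizes; how it is wired into the CSP blocks in the proposed model is described only at a high level in the abstract.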