Ryangsoo Kim, Geonyong Kim, Heedo Kim, Giha Yoon, Hark Yoo
2020 International Conference on Information and Communication Technology Convergence (ICTC). Published 2020-10-21. DOI: 10.1109/ICTC49870.2020.9289529
A Method for Optimizing Deep Learning Object Detection in Edge Computing
Recently, edge computing has received considerable attention as a promising solution for providing deep learning-based video analysis services in real time. However, due to the limited computational capability of the data processing units (such as CPUs, GPUs, and specialized accelerators) embedded in edge devices, how best to use these limited resources is one of the most pressing issues affecting the efficiency of deep learning-based video analysis services. In this paper, we introduce a practical approach to optimizing deep learning object detection on edge devices equipped with CPUs and GPUs. The proposed approach adopts TVM, an automated end-to-end deep learning compiler that optimizes deep learning workloads with respect to hardware-specific characteristics. In addition, task-level pipeline parallelism is applied to maximize resource utilization of the CPUs and GPUs and thereby improve overall object detection performance. Experimental results show that the proposed approach improves object detection performance, measured in frames per second, on multiple video streams.
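The task-level pipeline parallelism mentioned in the abstract can be sketched, independently of the paper's actual implementation, as producer/consumer stages connected by a bounded queue: one thread handles a CPU-bound stage (e.g. frame decoding) while another handles an accelerator-bound stage (e.g. GPU inference), so the two overlap instead of running sequentially. All function names and the stand-in "inference" step below are illustrative assumptions, not the authors' code.

```python
import queue
import threading

def decode_stage(frames, out_q):
    # CPU-bound stage: stand-in for video frame decoding.
    for f in frames:
        out_q.put(f)       # blocks when the queue is full (backpressure)
    out_q.put(None)        # sentinel: signal end of the stream

def infer_stage(in_q, results):
    # Accelerator-bound stage: stand-in for GPU object detection.
    while True:
        f = in_q.get()
        if f is None:
            break
        results.append(f * 2)  # placeholder for per-frame detection output

frames = list(range(8))
q = queue.Queue(maxsize=2)   # small buffer keeps the stages in lockstep
results = []
t1 = threading.Thread(target=decode_stage, args=(frames, q))
t2 = threading.Thread(target=infer_stage, args=(q, results))
t1.start(); t2.start()
t1.join(); t2.join()
print(results)  # [0, 2, 4, 6, 8, 10, 12, 14]
```

With real workloads, the bounded queue size trades memory for pipeline slack: a larger buffer tolerates jitter between stage latencies, while a size of 1 forces strict hand-off between CPU and GPU.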