Ryangsoo Kim, Geonyong Kim, Heedo Kim, Giha Yoon, Hark Yoo
2020 International Conference on Information and Communication Technology Convergence (ICTC). Published 2020-10-21. DOI: 10.1109/ICTC49870.2020.9289529
A Method for Optimizing Deep Learning Object Detection in Edge Computing
Recently, edge computing has received considerable attention as a promising solution for providing deep learning-based video analysis services in real time. However, due to the limited computational capability of the data processing units (such as CPUs, GPUs, and specialized accelerators) embedded in edge devices, how best to use these limited resources is one of the most pressing issues affecting the efficiency of deep learning-based video analysis services. In this paper, we introduce a practical approach to optimizing deep learning object detection on edge devices equipped with CPUs and GPUs. The proposed approach adopts TVM, an automated end-to-end deep learning compiler that optimizes deep learning workloads with respect to hardware-specific characteristics. In addition, task-level pipeline parallelism is applied to maximize resource utilization of the CPUs and GPUs and thereby improve overall object detection performance. Experimental results show that the proposed approach improves object detection performance, measured in frames per second, on multiple video streams.
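The task-level pipeline parallelism mentioned in the abstract can be sketched, independently of the paper's actual implementation, as producer/consumer stages connected by a bounded queue: one thread handles a CPU-bound stage (e.g. frame decoding) while another handles an accelerator-bound stage (e.g. GPU inference), so the two overlap instead of running sequentially. All function names and the stand-in "inference" step below are illustrative assumptions, not the authors' code.

```python
import queue
import threading

def decode_stage(frames, out_q):
    # CPU-bound stage: stand-in for video frame decoding.
    for f in frames:
        out_q.put(f)       # blocks when the queue is full (backpressure)
    out_q.put(None)        # sentinel: signal end of the stream

def infer_stage(in_q, results):
    # Accelerator-bound stage: stand-in for GPU object detection.
    while True:
        f = in_q.get()
        if f is None:
            break
        results.append(f * 2)  # placeholder for per-frame detection output

frames = list(range(8))
q = queue.Queue(maxsize=2)   # small buffer keeps the stages in lockstep
results = []
t1 = threading.Thread(target=decode_stage, args=(frames, q))
t2 = threading.Thread(target=infer_stage, args=(q, results))
t1.start(); t2.start()
t1.join(); t2.join()
print(results)  # [0, 2, 4, 6, 8, 10, 12, 14]
```

With real workloads, the bounded queue size trades memory for pipeline slack: a larger buffer tolerates jitter between stage latencies, while a size of 1 forces strict hand-off between CPU and GPU.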