A Splittable DNN-Based Object Detector for Edge-Cloud Collaborative Real-Time Video Inference

Joochan Lee, Yongwoo Kim, Sungtae Moon, J. Ko
{"title":"A Splittable DNN-Based Object Detector for Edge-Cloud Collaborative Real-Time Video Inference","authors":"Joochan Lee, Yongwoo Kim, Sungtae Moon, J. Ko","doi":"10.1109/AVSS52988.2021.9663806","DOIUrl":null,"url":null,"abstract":"While recent advances in deep neural networks (DNNs) enabled remarkable performance on various computer vision tasks, it is challenging for edge devices to perform real-time inference of complex DNN models due to their stringent resource constraint. To enhance the inference throughput, recent studies proposed collaborative intelligence (CI) that splits DNN computation into edge and cloud platforms, mostly for simple tasks such as image classification. However, for general DNN-based object detectors with a branching architecture, CI is highly restricted because of a significant feature transmission overhead. To solve this issue, this paper proposes a splittable object detector that enables edge-cloud collaborative real-time video inference. The proposed architecture includes a feature reconstruction network that can generate multiple features required for detection using a small-sized feature from the edge-side extractor. Asymmetric scaling on the feature extractor and reconstructor further reduces the transmitted feature size and edge inference latency, while maintaining detection accuracy. 
The performance evaluation using Yolov5 shows that the proposed model achieves 28 fps (2.45X and 1.56X higher than edge-only and cloud-only inference, respectively), on the NVIDIA Jetson TX2 platform in WiFi environment.","PeriodicalId":246327,"journal":{"name":"2021 17th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS)","volume":"258 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-11-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 17th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/AVSS52988.2021.9663806","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 5

Abstract

While recent advances in deep neural networks (DNNs) have enabled remarkable performance on various computer vision tasks, it is challenging for edge devices to perform real-time inference of complex DNN models due to their stringent resource constraints. To enhance inference throughput, recent studies have proposed collaborative intelligence (CI), which splits DNN computation between edge and cloud platforms, mostly for simple tasks such as image classification. However, for general DNN-based object detectors with a branching architecture, CI is highly restricted because of significant feature transmission overhead. To solve this issue, this paper proposes a splittable object detector that enables edge-cloud collaborative real-time video inference. The proposed architecture includes a feature reconstruction network that can generate the multiple features required for detection from a single small feature produced by the edge-side extractor. Asymmetric scaling of the feature extractor and reconstructor further reduces the transmitted feature size and edge inference latency while maintaining detection accuracy. A performance evaluation using YOLOv5 shows that the proposed model achieves 28 fps (2.45x and 1.56x higher than edge-only and cloud-only inference, respectively) on the NVIDIA Jetson TX2 platform in a WiFi environment.
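The data flow described in the abstract can be sketched as follows. This is a minimal NumPy illustration, not the paper's actual network: the strided pooling and fixed 1x1 projection stand in for the learned edge-side extractor, and the Kronecker upsampling stands in for the cloud-side feature reconstruction network that regenerates the multi-scale features a branching detector (such as YOLOv5's detection heads) expects. All layer shapes and parameter values here are illustrative assumptions.

```python
import numpy as np

def edge_extractor(frame, stride=8, bottleneck_ch=16):
    """Stand-in for the lightweight edge-side extractor: strided pooling
    plus a fixed 1x1 projection produce one compact feature map, the
    only tensor transmitted to the cloud."""
    h, w, c = frame.shape
    pooled = frame.reshape(h // stride, stride, w // stride, stride, c).mean(axis=(1, 3))
    proj = np.ones((c, bottleneck_ch)) / c  # fixed weights; a real model learns these
    return pooled @ proj

def cloud_reconstructor(feat, scales=(1, 2, 4)):
    """Stand-in for the cloud-side reconstruction network: regenerate
    the multiple feature maps the branching detection heads require
    from the single transmitted feature."""
    return [np.kron(feat, np.ones((s, s, 1))) for s in scales]

frame = np.random.rand(256, 256, 3)       # one video frame on the edge device
feat = edge_extractor(frame)              # compact feature sent over the network
pyramid = cloud_reconstructor(feat)       # multi-scale features for detection heads

print(frame.nbytes, feat.nbytes)          # transmitted feature is much smaller
print([p.shape for p in pyramid])
```

The key point the sketch mirrors is that only `feat` crosses the network boundary: shrinking it (here via the bottleneck channel count and stride, in the paper via asymmetric scaling of extractor and reconstructor) directly reduces the transmission overhead that otherwise makes CI impractical for branching detectors.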