Novel Adaptive DNN Partitioning Method Based on Image-Stream Pipeline Inference between the Edge and Cloud

Chenchen Ji, Yanjun Wu, Pengpeng Hou, Yang Tai, Jiageng Yu
Published in: 2022 3rd International Conference on Computing, Networks and Internet of Things (CNIOT)
Publication date: 2022-05-01
DOI: 10.1109/cniot55862.2022.00021 (https://doi.org/10.1109/cniot55862.2022.00021)
Citations: 0

Abstract

Cloud-only and edge-computing approaches have recently been proposed to meet the requirements of complex neural networks. However, the cloud-only approach suffers from high latency because large volumes of data must be sent to a centralized location in the cloud. Less powerful edge computing resources, in turn, require model compression to reduce computation, which degrades model accuracy. To address this challenge, deep neural network (DNN) partitioning has become a recent trend: a DNN model is sliced into a head portion and a tail portion, executed on the mobile edge device and the cloud server, respectively. We propose Edgepipe, a novel partitioning method based on pipeline inference over an image stream that automatically partitions DNN computation between the edge device and the cloud server, thereby reducing global latency and improving system-wide real-time performance. The method adapts to various DNN architectures, hardware platforms, and networks. Evaluated on a suite of five DNN applications, Edgepipe achieves average latency speedups of 1.241× and 1.154× over the cloud-only approach and the state-of-the-art approach known as "Neurosurgeon", respectively.
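The abstract does not describe how Edgepipe chooses its partition point, so the following is only a minimal sketch of the general idea of head/tail partitioning, not the authors' algorithm. It assumes hypothetical per-layer latency profiles for the edge and the cloud, hypothetical tensor sizes, and a fixed link bandwidth. It contrasts two possible objectives: minimizing the latency of a single image (the sum of edge, transfer, and cloud time) and minimizing the slowest stage, which is one plausible target when an image stream is processed as a pipeline and the three stages overlap. All function names and numbers are illustrative assumptions.

```python
from typing import List, Tuple


def stage_times(k: int, edge_ms: List[float], cloud_ms: List[float],
                sizes_kb: List[float], kb_per_ms: float) -> Tuple[float, float, float]:
    """Edge, transfer, and cloud time (ms) when layers [0, k) run on the edge.

    sizes_kb[0] is the input image size and sizes_kb[i] is the output size
    of layer i-1, so sizes_kb[k] is what crosses the network for split k.
    """
    edge = float(sum(edge_ms[:k]))        # head portion on the edge device
    transfer = sizes_kb[k] / kb_per_ms    # intermediate tensor over the link
    cloud = float(sum(cloud_ms[k:]))      # tail portion on the cloud server
    return edge, transfer, cloud


def best_single_image_split(edge_ms, cloud_ms, sizes_kb, kb_per_ms) -> int:
    """Split point minimizing end-to-end latency of one image (sum of stages)."""
    n = len(edge_ms)
    return min(range(n + 1),
               key=lambda k: sum(stage_times(k, edge_ms, cloud_ms, sizes_kb, kb_per_ms)))


def best_pipeline_split(edge_ms, cloud_ms, sizes_kb, kb_per_ms) -> int:
    """Split point minimizing the slowest stage (assumed pipeline bottleneck).

    Ties are broken by the smaller single-image latency.
    """
    n = len(edge_ms)

    def cost(k: int) -> Tuple[float, float]:
        t = stage_times(k, edge_ms, cloud_ms, sizes_kb, kb_per_ms)
        return max(t), sum(t)

    return min(range(n + 1), key=cost)


# Hypothetical profile of a 4-layer model (illustrative numbers only).
EDGE_MS = [4.0, 6.0, 8.0, 2.0]                # per-layer latency on the edge
CLOUD_MS = [1.0, 1.5, 2.0, 0.5]               # per-layer latency in the cloud
SIZES_KB = [150.0, 300.0, 100.0, 20.0, 1.0]   # input, then each layer's output
KB_PER_MS = 10.0                              # assumed link bandwidth

if __name__ == "__main__":
    print("single-image split:", best_single_image_split(EDGE_MS, CLOUD_MS, SIZES_KB, KB_PER_MS))
    print("pipeline split:", best_pipeline_split(EDGE_MS, CLOUD_MS, SIZES_KB, KB_PER_MS))
```

With these made-up numbers, the single-image objective keeps everything in the cloud (split 0), while the bottleneck objective splits after the second layer (split 2), where edge and transfer time are balanced at 10 ms each. This illustrates why a stream-oriented partitioner can choose a different point than a per-request one; how Edgepipe actually models and adapts this choice is not stated in the abstract.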