Enabling microservices management for Deep Learning applications across the Edge-Cloud Continuum

2021 IEEE 33rd International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD). Pub Date: 2021-10-01. DOI: 10.1109/SBAC-PAD53543.2021.00025

Zeina Houmani, Daniel Balouek-Thomert, E. Caron, M. Parashar


Citations: 5

Abstract

Deep Learning has shifted the focus of traditional batch workflows to data-driven feature engineering on streaming data. In particular, Deep Learning workflows are expected to deliver near-real-time results with a user-defined acceptable accuracy. Meeting the objectives of such applications across heterogeneous resources located at the edge of the network, in the core, and in between requires managing trade-offs between the accuracy and the urgency of the results. However, current data analysis approaches rarely manage the entire Deep Learning pipeline along the data path, which makes it complex for developers to implement strategies in real-world deployments. Driven by an object detection use case, this paper presents an architecture for time-critical Deep Learning workflows that provides a data-driven scheduling approach to distribute the pipeline across Edge-to-Cloud resources. Furthermore, it adopts a data management strategy that reduces the resolution of incoming data when potential trade-off optimizations are available. We illustrate the system's viability through a performance evaluation of the object detection use case on the Grid'5000 testbed. We demonstrate that in a multi-user scenario, with a standard frame rate of 25 frames per second, the system speeds up data analysis by up to 54.4% compared to a Cloud-only scenario, while keeping the analysis accuracy above a fixed threshold.
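The accuracy/urgency trade-off described in the abstract can be illustrated with a minimal sketch. This is not the paper's scheduler: the `Option` type, the `pick_option` function, the site labels, and every latency and accuracy figure below are hypothetical values chosen for illustration. Only the 25 frames-per-second rate comes from the abstract. The sketch assumes that each candidate placement and input resolution has a known latency and accuracy, and it picks the fastest candidate whose accuracy stays at or above a user-defined threshold.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

FRAME_RATE_FPS = 25                       # frame rate quoted in the abstract
FRAME_BUDGET_MS = 1000 / FRAME_RATE_FPS   # time between consecutive frames (40 ms)


@dataclass(frozen=True)
class Option:
    """One hypothetical way to process a frame (all fields are assumptions)."""
    site: str          # where inference runs, e.g. "edge" or "cloud"
    resolution: int    # frame height in pixels after optional downscaling
    latency_ms: float  # assumed end-to-end latency (transfer + inference)
    accuracy: float    # assumed detection accuracy at this resolution


def pick_option(options: Sequence[Option], min_accuracy: float) -> Optional[Option]:
    """Return the lowest-latency option that meets the accuracy threshold.

    Returns None when no option is accurate enough.
    """
    accurate_enough = [o for o in options if o.accuracy >= min_accuracy]
    if not accurate_enough:
        return None
    return min(accurate_enough, key=lambda o: o.latency_ms)


# Illustrative numbers only; they are not measurements from the paper.
OPTIONS = [
    Option("cloud", 1080, 95.0, 0.92),
    Option("cloud", 480, 60.0, 0.81),
    Option("edge", 480, 35.0, 0.80),
    Option("edge", 240, 20.0, 0.62),
]

if __name__ == "__main__":
    choice = pick_option(OPTIONS, min_accuracy=0.75)
    print(choice)
    if choice is not None:
        # Check whether the chosen option keeps up with the incoming stream.
        print("within frame budget:", choice.latency_ms <= FRAME_BUDGET_MS)
```

With these made-up figures, a lower accuracy threshold lets the selection fall back to a reduced resolution at the edge, while a stricter threshold forces full-resolution processing in the cloud at a higher latency. A real system would have to measure or estimate latency and accuracy at run time rather than read them from a fixed table.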
