Self-aware collaborative edge inference with embedded devices for IIoT

Impact factor 6.2; Zone 2 (Computer Science); JCR Q1 (COMPUTER SCIENCE, THEORY & METHODS)

Future Generation Computer Systems-The International Journal of Escience. Pub Date: 2024-09-23. DOI: 10.1016/j.future.2024.107535

Yifan Chen, Zhuoquan Yu, Yi Jin, Christine Mwase, Xin Hu, Li Da Xu, Zhuo Zou, Lirong Zheng

Volume 163, Article 107535. https://www.sciencedirect.com/science/article/pii/S0167739X24004990

Citations: 0

Abstract

Edge inference and other compute-intensive industrial Internet of Things (IIoT) applications suffer from poor quality of experience because the computing and communication resources of embedded devices are limited and heterogeneous. To tackle these issues, we propose a self-aware collaborative edge inference framework based on model partitioning. Specifically, a device adaptively adjusts its local model inference scheme by sensing the available computing and communication resources of surrounding devices. When local computation cannot meet the inference latency requirement, the model is partitioned for collaborative computation on other devices to improve inference efficiency. Furthermore, for two typical IIoT scenarios, bursting tasks and stacking tasks, latency-aware and throughput-aware collaborative inference algorithms are designed, respectively. By jointly optimizing the partition layer and the selection of collaborative devices, the optimal inference efficiency, characterized by minimum inference latency and maximum inference throughput, can be obtained. Finally, the performance of our proposal is validated through extensive simulations and tests conducted on 10 Raspberry Pi 4Bs using popular models. Specifically, in the case of two collaborative devices, our platform achieves up to 92.59% latency reduction for bursting tasks and 16.19× throughput growth for stacking tasks. In addition, the divergence between simulations and tests ranges from 1.64% to 9.56% for bursting tasks and from 3.24% to 11.24% for stacking tasks, which indicates that the theoretical performance analyses are solid. For the general case where data privacy is not considered and the number of collaborative devices is optimally determined, up to 14.76× throughput speedup and 84.04% latency reduction can be obtained.
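The joint choice of partition layer and collaborative device described in the abstract can be illustrated with a small brute-force search. This is a minimal sketch under stated assumptions, not the authors' algorithm: the cost model (per-layer compute times, one transfer of the intermediate tensor, a single helper device), the names `Helper`, `Plan`, `stage_times` and `best_plan`, and all numbers are hypothetical. For a single bursting task the cost is taken to be the end-to-end latency; for stacking tasks the three stages are assumed to be pipelined, so the cost is the slowest stage and throughput is its reciprocal.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Helper:
    """Hypothetical collaborating device."""
    name: str
    layer_time_s: Sequence[float]  # per-layer compute time on this helper
    bandwidth_bps: float           # link rate from the local device


@dataclass(frozen=True)
class Plan:
    cost_s: float           # latency, or bottleneck stage time
    helper: Optional[str]   # None means everything runs locally
    split: int              # layers [0, split) run on the local device


def stage_times(local_time_s, cut_bytes, helper, split):
    """Return (local, transfer, remote) seconds for one split point.

    cut_bytes[i] is the size of the tensor entering layer i, i.e. the data
    that must be sent if the model is cut just before layer i.
    """
    local = sum(local_time_s[:split])
    if split == len(local_time_s):
        return local, 0.0, 0.0  # no offloading at all
    transfer = 8 * cut_bytes[split] / helper.bandwidth_bps
    remote = sum(helper.layer_time_s[split:])
    return local, transfer, remote


def best_plan(local_time_s, cut_bytes, helpers, objective):
    """Exhaustively search helpers and split points.

    objective="latency": minimise local + transfer + remote.
    objective="throughput": minimise the slowest pipeline stage.
    """
    n = len(local_time_s)
    best = None
    for helper in helpers:
        for split in range(n + 1):
            stages = stage_times(local_time_s, cut_bytes, helper, split)
            cost = sum(stages) if objective == "latency" else max(stages)
            if best is None or cost < best.cost_s:
                best = Plan(cost, helper.name if split < n else None, split)
    return best


# Made-up three-layer model and two candidate helpers.
local_time_s = [0.3, 0.3, 0.4]
cut_bytes = [100_000, 50_000, 10_000]
helpers = [
    Helper("slow", [0.5, 0.5, 0.5], 8e6),
    Helper("fast", [0.2, 0.2, 0.2], 8e6),
]

latency_plan = best_plan(local_time_s, cut_bytes, helpers, "latency")
throughput_plan = best_plan(local_time_s, cut_bytes, helpers, "throughput")
print(latency_plan)
print(throughput_plan, "tasks/s:", 1 / throughput_plan.cost_s)
```

With these made-up numbers the two objectives pick different split points on the same helper: offloading the whole model gives the lowest single-task latency, while cutting after the first layer balances the pipeline stages better. That difference is one reason to treat bursting and stacking tasks with separate objectives. An exhaustive search is only practical for small device counts and is used here purely for clarity.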


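Source journal

Future Generation Computer Systems-The International Journal of Escience (Engineering & Technology: Computer Science, Theory & Methods)

CiteScore: 19.90

Self-citation rate: 2.70%

Articles published: 376

Review time: 10.6 months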

About the journal: Computing infrastructures and systems are constantly evolving, resulting in increasingly complex and collaborative scientific applications. To cope with these advancements, there is a growing need for collaborative tools that can effectively map, control, and execute these applications. Furthermore, with the explosion of Big Data, there is a requirement for innovative methods and infrastructures to collect, analyze, and derive meaningful insights from the vast amount of data generated. This necessitates the integration of computational and storage capabilities, databases, sensors, and human collaboration. Future Generation Computer Systems aims to pioneer advancements in distributed systems, collaborative environments, high-performance computing, and Big Data analytics. It strives to stay at the forefront of developments in grids, clouds, and the Internet of Things (IoT) to effectively address the challenges posed by these wide-area, fully distributed sensing and computing systems.