Multi-Agent Systems for Collaborative Inference Based on Deep Policy Q-Inference Network

IF 2.9, Zone 2 (Computer Science), JCR Q2: COMPUTER SCIENCE, INFORMATION SYSTEMS

Journal of Grid Computing Pub Date : 2024-02-29 DOI:10.1007/s10723-024-09750-w

Shangshang Wang, Yuqin Jing, Kezhu Wang, Xue Wang


Citations: 0

Abstract

This study tackles the problem of increasing efficiency and scalability in deep neural network (DNN) systems by employing collaborative inference, an approach that is gaining popularity because of its ability to make the most of available computational resources. Collaborative inference splits a pre-trained DNN model into two parts and runs them separately on user equipment (UE) and edge servers. This yields faster and more energy-efficient inference, because computation can be offloaded to edge servers rather than relying solely on UEs. However, a significant challenge of collaborative inference is the dynamic coupling of DNN layers, which makes it difficult to separate the layers and run them independently. To address this challenge, we propose a novel approach to optimizing collaborative inference in a multi-agent scenario in which a single edge server coordinates the inference of multiple UEs. The proposed method uses an autoencoder-based technique to reduce the size of intermediate features and constructs tasks using the overhead of the deep policy inference Q-network (DPIQN). To optimize the collaborative inference, we employ the Deep Recurrent Policy Inference Q-Network (DRPIQN) technique, which allows for a hybrid action space. Test results demonstrate that this approach can reduce inference latency by up to 56% and energy usage by up to 72% on various networks. Overall, the proposed approach provides an efficient and effective method for implementing collaborative inference in multi-agent scenarios, which could have significant implications for developing DNN systems.
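The trade-off behind splitting a DNN between a UE and an edge server can be illustrated with a minimal sketch. This is not the DRPIQN method from the paper: it is a brute-force search over split points under a simple additive latency model, and every name and number in it (`Layer`, `split_latency`, `best_split`, the per-layer timings, feature sizes, uplink rate, and compression ratio) is a hypothetical assumption chosen for illustration. The `compress_ratio` parameter stands in for the effect of an autoencoder that shrinks the intermediate feature before transmission.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    ue_ms: float    # hypothetical time to run this layer on the UE
    edge_ms: float  # hypothetical time to run this layer on the edge server
    out_kb: float   # hypothetical size of this layer's output feature


def split_latency(layers, split, input_kb, uplink_kb_per_ms, compress_ratio=1.0):
    """Latency when layers[:split] run on the UE and layers[split:] on the edge."""
    ue_time = sum(layer.ue_ms for layer in layers[:split])
    edge_time = sum(layer.edge_ms for layer in layers[split:])
    if split == len(layers):
        tx_kb = 0.0  # fully local inference: nothing is transmitted
    elif split == 0:
        tx_kb = input_kb  # fully offloaded: the raw input is sent uncompressed
    else:
        # the intermediate feature is compressed before it is sent
        tx_kb = layers[split - 1].out_kb * compress_ratio
    return ue_time + tx_kb / uplink_kb_per_ms + edge_time


def best_split(layers, input_kb, uplink_kb_per_ms, compress_ratio=1.0):
    """Try every split point and return (split index, latency in ms)."""
    candidates = range(len(layers) + 1)
    split = min(
        candidates,
        key=lambda s: split_latency(layers, s, input_kb, uplink_kb_per_ms, compress_ratio),
    )
    return split, split_latency(layers, split, input_kb, uplink_kb_per_ms, compress_ratio)


# Hypothetical four-layer model; all values are made up for illustration.
layers = [
    Layer(ue_ms=4.0, edge_ms=1.0, out_kb=800.0),
    Layer(ue_ms=6.0, edge_ms=1.5, out_kb=200.0),
    Layer(ue_ms=10.0, edge_ms=2.0, out_kb=50.0),
    Layer(ue_ms=8.0, edge_ms=2.0, out_kb=4.0),
]

print(best_split(layers, input_kb=600.0, uplink_kb_per_ms=10.0))
print(best_split(layers, input_kb=600.0, uplink_kb_per_ms=10.0, compress_ratio=0.25))
```

With these made-up numbers, compressing the intermediate feature moves the best split point earlier in the network, so less work stays on the UE. A learned policy such as the one described in the abstract would have to make this kind of choice for several UEs at once and under changing conditions, which exhaustive search over a fixed model does not capture.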


Source journal

Journal of Grid Computing (COMPUTER SCIENCE, INFORMATION SYSTEMS; COMPUTER SCIENCE, THEORY & METHODS)

CiteScore: 8.70

Self-citation rate: 9.10%

Number of publications:

Review time: >12 weeks

Journal description: Grid Computing is an emerging technology that enables large-scale resource sharing and coordinated problem solving within distributed, often loosely coordinated groups (what are sometimes termed "virtual organizations"). By providing scalable, secure, high-performance mechanisms for discovering and negotiating access to remote resources, Grid technologies promise to make it possible for scientific collaborations to share resources on an unprecedented scale, and for geographically distributed groups to work together in ways that were previously impossible. Similar technologies are being adopted within industry, where they serve as important building blocks for emerging service provider infrastructures. Although the advantages of this technology for certain classes of applications have been acknowledged, further research is needed to broaden the applicability and scope of the current body of knowledge. That research spans multiple domains of computer science (networking, middleware, programming, algorithms), the application disciplines themselves, and areas such as sociology and economics.