Energy-Latency Tradeoffs for Service Placement Based on Reinforcement Learning in Edge Computing
Bing Tang, Haiyan Li, Wei Xu, Buqing Cao, Qing Yang
Concurrency and Computation: Practice and Experience, vol. 37, no. 15-17 (published 2025-06-06). DOI: 10.1002/cpe.70154
Microservice technology, as a flexible application architecture, has gained wide popularity in the Internet of Things (IoT). IoT applications are highly sensitive to latency, so placing microservices on appropriate edge servers in an edge computing environment is crucial: poor placement significantly degrades service quality and user experience. To address these issues, this paper proposes a multiobjective, reinforcement-learning-based service deployment strategy for IoT devices. The goal is to minimize service access delay for IoT devices while reducing the average energy consumption of edge servers in mobile edge computing. To this end, we first establish a stochastic optimization model based on the Markov decision process (MDP) framework that handles service deployment and resource allocation dynamically. The model captures key characteristics such as the heterogeneity of edge server capabilities, the dynamic geographic information of IoT devices, and the uncertainty of microservice requests. To overcome the dimensionality, slow convergence, and exploration-exploitation challenges of traditional reinforcement learning algorithms, we introduce deep reinforcement learning into the optimization of microservice deployment. Specifically, we use deep deterministic policy gradient (DDPG) to obtain a near-optimal service deployment strategy without manual intervention; DDPG uses deep networks to guide the policy gradient and generates solutions that effectively balance exploration and exploitation. To evaluate the proposed approach, we implement the DDPG-MSP (DDPG-based MicroService Placement) algorithm using both real and synthetic datasets. Comparative analysis with existing microservice deployment algorithms demonstrates the superiority of DDPG-MSP in performance, robustness, and scalability.
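The abstract only names the method, but the core DDPG machinery it relies on is standard. Below is a minimal, self-contained PyTorch sketch of that machinery applied to a placement-style problem: a deterministic actor scores candidate servers, a critic estimates the long-run value of a placement, and a weighted sum of access delay and average energy (one common way to scalarize a bi-objective reward, assumed here) drives learning. The state/action encodings, network sizes, and every identifier in the sketch are illustrative assumptions, not the authors' DDPG-MSP implementation.

```python
# Illustrative DDPG actor-critic for edge service placement (PyTorch).
# STATE_DIM, ACTION_DIM, the reward weighting, and all names below are
# hypothetical stand-ins, not the paper's DDPG-MSP implementation.
import torch
import torch.nn as nn

STATE_DIM = 16   # assumed encoding: server loads + device locations + request rates
ACTION_DIM = 4   # assumed: one placement score per candidate edge server

class Actor(nn.Module):
    """Deterministic policy mu(s): maps a system state to placement scores."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, 64), nn.ReLU(),
            nn.Linear(64, ACTION_DIM), nn.Tanh(),  # bounded continuous action
        )

    def forward(self, s):
        return self.net(s)

class Critic(nn.Module):
    """Q(s, a): estimated long-run value of taking placement action a in state s."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM + ACTION_DIM, 64), nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, s, a):
        return self.net(torch.cat([s, a], dim=-1))

def reward(latency, energy, w=0.5):
    """Assumed scalarization: reward rises as delay and average energy fall."""
    return -(w * latency + (1.0 - w) * energy)

def ddpg_update(actor, critic, actor_t, critic_t, batch, opt_a, opt_c,
                gamma=0.99, tau=0.005):
    """One DDPG step on a replay batch (s, a, r, s2); actor_t/critic_t are target nets."""
    s, a, r, s2 = batch
    with torch.no_grad():                       # bootstrapped TD target
        q_target = r + gamma * critic_t(s2, actor_t(s2))
    critic_loss = nn.functional.mse_loss(critic(s, a), q_target)
    opt_c.zero_grad(); critic_loss.backward(); opt_c.step()

    actor_loss = -critic(s, actor(s)).mean()    # ascend Q along the policy gradient
    opt_a.zero_grad(); actor_loss.backward(); opt_a.step()

    for net, net_t in ((actor, actor_t), (critic, critic_t)):
        for p, pt in zip(net.parameters(), net_t.parameters()):
            pt.data.mul_(1.0 - tau).add_(tau * p.data)   # Polyak (soft) target update

if __name__ == "__main__":
    actor, critic = Actor(), Critic()
    actor_t, critic_t = Actor(), Critic()
    actor_t.load_state_dict(actor.state_dict())
    critic_t.load_state_dict(critic.state_dict())
    opt_a = torch.optim.Adam(actor.parameters(), lr=1e-4)
    opt_c = torch.optim.Adam(critic.parameters(), lr=1e-3)
    batch = (torch.randn(32, STATE_DIM), torch.randn(32, ACTION_DIM),
             torch.randn(32, 1), torch.randn(32, STATE_DIM))
    ddpg_update(actor, critic, actor_t, critic_t, batch, opt_a, opt_c)
```

The bootstrapped TD target and the soft target-network updates are what distinguish DDPG from a naive actor-critic, which is consistent with the convergence and exploration-exploitation concerns the abstract raises about traditional reinforcement learning.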
Journal description:
Concurrency and Computation: Practice and Experience (CCPE) publishes high-quality original research papers and authoritative research review papers in the overlapping fields of:
Parallel and distributed computing;
High-performance computing;
Computational and data science;
Artificial intelligence and machine learning;
Big data applications, algorithms, and systems;
Network science;
Ontologies and semantics;
Security and privacy;
Cloud/edge/fog computing;
Green computing; and
Quantum computing.