ReinFog: A Deep Reinforcement Learning empowered framework for resource management in edge and cloud computing environments

IF 8 | Zone 2, Computer Science | JCR Q1: COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

Journal of Network and Computer Applications | Pub Date: 2025-06-19 | DOI: 10.1016/j.jnca.2025.104250

Zhiyu Wang, Mohammad Goudarzi, Rajkumar Buyya

Journal of Network and Computer Applications, Volume 242, Article 104250. Full text: https://www.sciencedirect.com/science/article/pii/S108480452500147X

Citations: 0

Abstract

The growing IoT landscape requires effective server deployment strategies to meet demands such as real-time processing and energy efficiency, a task complicated by heterogeneous, dynamic applications and servers. To address these challenges, we propose ReinFog, a modular distributed software framework empowered with Deep Reinforcement Learning (DRL) for adaptive resource management across edge/fog and cloud environments. ReinFog enables the practical development and deployment of various centralized and distributed DRL techniques for resource management in these environments. It also supports integrating native and library-based DRL techniques for diverse IoT application scheduling objectives. Additionally, ReinFog allows deployment configurations to be customized for different DRL techniques, including the number and placement of DRL Learners and DRL Workers in large-scale distributed systems. We also propose MADCP, a novel Memetic Algorithm for DRL Component Placement (e.g., of DRL Learners and DRL Workers) in ReinFog, which combines the strengths of the Genetic Algorithm, the Firefly Algorithm, and Particle Swarm Optimization. Experiments show that the DRL mechanisms developed within ReinFog significantly improve the implementation of both centralized and distributed DRL techniques. These advancements improve IoT application performance, reducing response time by 45%, energy consumption by 39%, and weighted cost by 37%, while maintaining minimal scheduling overhead. ReinFog also scales well: increasing the number of DRL Workers from 1 to 30 adds only 0.3 seconds to startup time and around 2 MB of RAM per Worker. The proposed MADCP further accelerates the convergence rate of DRL techniques by up to 38%.
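The abstract names the ingredients of MADCP (Genetic Algorithm, Firefly Algorithm, Particle Swarm Optimization) but gives no operator details, so the following is only a toy sketch of how a memetic placement search of that general kind might be structured. It is not the authors' algorithm. The latency table, the crowding penalty, the `cost` objective, and every function name and parameter below are hypothetical, chosen purely for illustration.

```python
import random

# Hypothetical toy instance: LATENCY[c][h] is the cost of running DRL
# component c (a Learner or Worker) on host h. All numbers are invented.
LATENCY = [
    [4.0, 2.0, 9.0],
    [7.0, 3.0, 1.0],
    [5.0, 8.0, 2.0],
    [1.0, 6.0, 6.0],
]
N_HOSTS = 3


def cost(placement):
    """Assumed objective: total latency plus a penalty for crowding a host."""
    total = sum(LATENCY[c][h] for c, h in enumerate(placement))
    crowding = sum(placement.count(h) ** 2 for h in range(N_HOSTS))
    return total + crowding


def crossover(a, b, rng):
    """GA-style one-point crossover of two placements."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def pull_toward(p, target, rate, rng):
    """Copy each gene from `target` with probability `rate`.

    A discrete stand-in for the attraction step of firefly/PSO-style updates.
    """
    return [t if rng.random() < rate else g for g, t in zip(p, target)]


def local_search(p):
    """Memetic refinement: one pass accepting single-component moves that lower cost."""
    best = list(p)
    for c in range(len(best)):
        for h in range(N_HOSTS):
            trial = best[:c] + [h] + best[c + 1:]
            if cost(trial) < cost(best):
                best = trial
    return best


def memetic_placement(pop_size=8, generations=20, seed=0):
    """Return (placement, cost) for the best placement found."""
    rng = random.Random(seed)
    n = len(LATENCY)
    pop = [[rng.randrange(N_HOSTS) for _ in range(n)] for _ in range(pop_size)]
    best = min(pop, key=cost)
    for _ in range(generations):
        nxt = []
        for p in pop:
            mate = rng.choice(pop)
            child = crossover(p, mate, rng)
            brighter = mate if cost(mate) < cost(p) else p
            child = pull_toward(child, brighter, 0.3, rng)  # firefly-like pull
            child = pull_toward(child, best, 0.3, rng)      # PSO-like global-best pull
            if rng.random() < 0.2:                          # GA-style mutation
                child[rng.randrange(n)] = rng.randrange(N_HOSTS)
            nxt.append(local_search(child))
        pop = nxt
        best = min(pop + [best], key=cost)  # the incumbent is never lost
    return best, cost(best)
```

Calling `memetic_placement()` returns one host index per component together with its cost under the assumed objective. The fixed seed keeps runs reproducible. A real placement objective would presumably reflect measured communication and compute costs between Learners and Workers rather than an invented table.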




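Source journal

Journal of Network and Computer Applications (Engineering & Technology: Computer Science, Interdisciplinary Applications)

CiteScore: 21.50
Self-citation rate: 3.40%
Articles published: 142
Review time: 37 days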

About the journal: The Journal of Network and Computer Applications welcomes research contributions, surveys, and notes in all areas relating to computer networks and their applications. Sample topics include new design techniques; interesting or novel applications, components, or standards; computer networks with tools such as WWW; emerging standards for internet protocols; wireless networks; mobile computing; emerging computing models such as cloud computing and grid computing; and applications of networked systems for remote collaboration and telemedicine. The journal is abstracted and indexed in Scopus, Engineering Index, Web of Science, Science Citation Index Expanded, and INSPEC.