Enhancing fog load balancing through lifelong transfer learning of reinforcement learning agents

IF 4.5; Zone 3, Computer Science; Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

Computer Communications; Pub Date: 2025-02-01; DOI: 10.1016/j.comcom.2024.108024

Maad Ebrahim, Abdelhakim Hafid, Mohamed Riduan Abid

Computer Communications, Volume 231, Article 108024. Full text: https://www.sciencedirect.com/science/article/pii/S0140366424003712

Citations: 0

Abstract

Fog computing is a promising paradigm for processing Internet of Things (IoT) data. Load balancing (LB) optimizes Fog performance through efficient resource allocation, improving resource utilization, latency for real-time IoT applications, and users’ quality of service. In this work, we enhance the learning process of privacy-aware Reinforcement Learning (PARL), which requires significant training to minimize waiting delays by reducing the number of queued requests without explicitly observing Fog load or resource capabilities. To achieve this, we explore different Transfer Learning (TL) techniques for efficient adaptation to variations in demand, triggering a fine-tuning process when abrupt surges in generation rates are detected. This exploration highlights the advantages and disadvantages of reusing previously learned policies (knowledge) and interactions (experience) over multiple learning epochs with increased difficulties. Our results show that Full TL (using knowledge and experience) enhances the learning and generalization of the PARL agent, allowing it to consistently converge to the optimal solution with 80% less training compared to without TL. Additionally, we propose a lifelong learning framework for practical agent deployment in frequently changing environments. Introducing TL in this framework significantly reduces the computationally expensive training phase compared to training from scratch. Instead of continuous adaptation through ongoing training, balancer resources are preserved to provide faster decisions via a lightweight inference model. In case of significant system changes, the model is swiftly fine-tuned using TL. Furthermore, the framework leverages existing (expert) or simulation-trained agents to initialize newly deployed agents in the network, reducing failure probability in new environments compared to learning from scratch. To our knowledge, no existing efforts in the literature use TL to address lifelong learning for practical RL-based Fog LB. This gap highlights the need for a practical yet efficient solution that minimizes the cost of continuous adaptation to changing conditions.
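The two mechanisms named in the abstract, reusing knowledge and experience and triggering fine-tuning on a surge in generation rates, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Agent` container, the `transfer` helper, the `SurgeDetector` class, and the threshold rule (newest rate above `ratio` times a moving average) are all hypothetical choices made for illustration, and a tabular Q-table stands in for whatever policy representation PARL actually uses.

```python
from collections import deque
from dataclasses import dataclass, field
from statistics import mean
from typing import Deque, Dict, List, Tuple

# (state, action, reward, next_state); a hypothetical transition layout.
Transition = Tuple[int, int, float, int]


@dataclass
class Agent:
    """Toy agent: `q` stands in for the learned policy (knowledge),
    `replay` for the stored interactions (experience)."""
    q: Dict[Tuple[int, int], float] = field(default_factory=dict)
    replay: List[Transition] = field(default_factory=list)


def transfer(source: Agent, knowledge: bool, experience: bool) -> Agent:
    """Initialise a new agent from a source agent.

    knowledge=True and experience=True mirrors what the abstract calls
    Full TL; both False is equivalent to learning from scratch.
    """
    return Agent(
        q=dict(source.q) if knowledge else {},
        replay=list(source.replay) if experience else [],
    )


class SurgeDetector:
    """Flags a surge when the newest rate exceeds `ratio` times the mean
    of the previous `window` observations (an assumed rule, not the paper's)."""

    def __init__(self, window: int = 5, ratio: float = 1.5) -> None:
        self.history: Deque[float] = deque(maxlen=window)
        self.ratio = ratio

    def update(self, rate: float) -> bool:
        # Only judge once a full window of earlier rates is available.
        full = len(self.history) == self.history.maxlen
        surge = full and rate > self.ratio * mean(self.history)
        self.history.append(rate)
        return surge
```

In a deployment loop of this shape, the balancer would run inference only, call `update` with each observed generation rate, and start a fine-tuning phase from `transfer(current_agent, True, True)` when `update` returns `True`.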




Source journal

Computer Communications (Engineering and Technology: Telecommunications)

CiteScore: 14.10

Self-citation rate: 5.00%

Articles published: 397

Review time: 66 days

Journal description: Computer and Communications networks are key infrastructures of the information society with high socio-economic value, as they contribute to the correct operation of many critical services (from healthcare to finance and transportation). The Internet is the core of today's computer-communication infrastructures. This has transformed the Internet from a robust network for data transfer between computers into a global, content-rich communication and information system where content is increasingly generated by users and distributed according to human social relations. Next-generation network technologies, architectures and protocols are therefore required to overcome the limitations of the legacy Internet and add new capabilities and services. The future Internet should be ubiquitous, secure, resilient, and closer to human communication paradigms. Computer Communications is a peer-reviewed international journal that publishes high-quality scientific articles (both theory and practice) and survey papers covering all aspects of future computer communication networks (on all layers, except the physical layer), with special attention to the evolution of the Internet architecture, protocols, services, and applications.