Cache Allocation in Multi-Tenant Edge Computing: An Online Model-Based Reinforcement Learning Approach

IF 5 · Zone 2 (Computer Science) · Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

IEEE Transactions on Cloud Computing Pub Date : 2025-02-04 DOI:10.1109/TCC.2025.3538158

Ayoub Ben-Ameur;Andrea Araldo;Tijani Chahed;György Dán

IEEE Transactions on Cloud Computing, vol. 13, no. 2, pp. 459-472. https://ieeexplore.ieee.org/document/10870410/

Abstract

We consider a Network Operator (NO) that owns Edge Computing (EC) resources, virtualizes them, and lets third-party Service Providers (SPs) run their services using the allocated slice of resources. We focus on one specific resource, cache space, and on the problem of how to allocate it among several SPs in order to minimize backhaul traffic. Due to confidentiality guarantees, the NO cannot observe the nature of the SPs' traffic, which is encrypted. Allocation decisions are thus challenging, since they must be taken solely on the basis of observed monitoring information. Another challenge is that not all traffic is cacheable. We propose a data-driven cache allocation strategy based on Reinforcement Learning (RL). Unlike most RL applications, in which the decision policy is learned offline on a simulator, we assume no previous knowledge is available to build such a simulator. We thus apply RL in an online fashion, i.e., the model and the policy are learned by directly perturbing and monitoring the actual system. Since perturbations generate spurious traffic, they must be limited, which requires learning to be extremely efficient. To this aim, we devise a strategy that learns an approximation of the cost function while interacting with the system. We then use this approximation in a Model-Based RL (MB-RL) scheme to speed up convergence. We prove analytically that our strategy brings the cache allocation boundedly close to the optimum and that it stably remains there. We show in simulations that such convergence is obtained within a few minutes. We also study the strategy's fairness and its sensitivity to several scenario characteristics, and compare it with a state-of-the-art method.
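The loop the abstract describes (perturb the live allocation, monitor the resulting backhaul traffic, learn an approximate cost model, act on that model) can be illustrated with a toy sketch. This is not the authors' algorithm: the Zipf workload, the per-SP rates and cacheable fractions, the tabular cost model, the greedy slot-exchange rule, and every name below (`ToyEdgeCache`, `allocate_online`, `delta`) are assumptions made only for illustration.

```python
import itertools
import random


def zipf_miss_ratio(slots, catalog, alpha):
    """Miss ratio when the `slots` most popular of `catalog` Zipf(alpha) objects are cached."""
    weights = [rank ** -alpha for rank in range(1, catalog + 1)]
    return 1.0 - sum(weights[:slots]) / sum(weights)


class ToyEdgeCache:
    """Hidden ground truth; the operator only sees noisy per-SP backhaul traffic."""

    # (request rate, Zipf exponent, cacheable fraction) per SP: made-up values
    SPS = [(200.0, 1.2, 0.9), (20.0, 0.8, 0.1), (80.0, 1.0, 0.6)]
    CATALOG = 500

    def __init__(self, noise, seed=0):
        self.noise = noise
        self.rng = random.Random(seed)

    def _sp_costs(self, alloc):
        # Non-cacheable traffic always crosses the backhaul; cacheable traffic only on a miss.
        return [rate * ((1.0 - c) + c * zipf_miss_ratio(s, self.CATALOG, a))
                for s, (rate, a, c) in zip(alloc, self.SPS)]

    def observe(self, alloc):
        return [x * (1.0 + self.rng.uniform(-self.noise, self.noise))
                for x in self._sp_costs(alloc)]

    def true_cost(self, alloc):
        return sum(self._sp_costs(alloc))


def allocate_online(system, n_sps, total_slots, delta, steps, seed=0):
    """Learn a tabular cost model from live observations and follow it greedily.

    Assumes total_slots is divisible by n_sps and delta divides the even share.
    """
    rng = random.Random(seed)
    base = [total_slots // n_sps] * n_sps   # start from the even split
    model = [{} for _ in range(n_sps)]      # model[i][slots] = (mean cost, sample count)

    def neighbours(alloc):
        # Every allocation reachable by moving `delta` slots from SP i to SP j.
        for i, j in itertools.permutations(range(n_sps), 2):
            if alloc[i] >= delta:
                nxt = list(alloc)
                nxt[i] -= delta
                nxt[j] += delta
                yield nxt

    def known(alloc):
        return all(s in model[i] for i, s in enumerate(alloc))

    def predicted(alloc):
        return sum(model[i][s][0] for i, s in enumerate(alloc))

    def apply_and_learn(alloc):
        # Apply the allocation to the real system and fold the observation into the model.
        for i, (s, cost) in enumerate(zip(alloc, system.observe(alloc))):
            mean, n = model[i].get(s, (0.0, 0))
            model[i][s] = (mean + (cost - mean) / (n + 1), n + 1)

    apply_and_learn(base)
    for _ in range(steps):
        unknown = [a for a in neighbours(base) if not known(a)]
        if unknown:
            apply_and_learn(rng.choice(unknown))   # perturbation: probe an unexplored neighbour
            continue
        best = min(neighbours(base), key=predicted)
        if predicted(best) < predicted(base) - 1e-9:
            base = best                            # model-based step toward lower predicted cost
        apply_and_learn(base)
    return base
```

For instance, `allocate_online(ToyEdgeCache(noise=0.02), n_sps=3, total_slots=300, delta=25, steps=300)` returns a slot split for the three hypothetical SPs. Probing only the neighbours of the current allocation keeps perturbations small, which mirrors the abstract's concern about spurious traffic, although the paper's actual model and policy may differ substantially.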


Source journal

IEEE Transactions on Cloud Computing (Computer Science-Software)

CiteScore: 9.40

Self-citation rate: 6.20%

Articles published: 167

About the journal: The IEEE Transactions on Cloud Computing (TCC) is dedicated to the multidisciplinary field of cloud computing. It is committed to the publication of articles that present innovative research ideas, application results, and case studies in cloud computing, focusing on key technical issues related to theory, algorithms, systems, applications, and performance.