Manel Gherari;Adyson Maia;Mouhamad Dieye;Halima Elbiaze;Yacine Ghamri-Doudane;Roch H. Glitho
Title: Optimizing Immersive Services With Parallel In-Network Rendering and Deep RL
Journal: IEEE Transactions on Machine Learning in Communications and Networking, vol. 4, pp. 491-513
DOI: 10.1109/TMLCN.2026.3666742
Publication date: 2026-02-20 (Journal Article)
URL: https://ieeexplore.ieee.org/document/11402906/
Citations: 0
Abstract
This paper addresses the challenge of delivering low-latency, scalable immersive experiences by exploiting a hybrid continuum of cloud, edge, and In-Network Computing (INC) resources. Delivering such experiences requires transferring a large number of digital assets of varying sizes, many of them large, static scene elements corresponding to service-specific and user-specific components. We argue that such elements can be separated out into an in-network rendering farm, while popular assets are dynamically cached and rapidly changing, user-centric data is synchronized at INC, edge, or cloud nodes. To orchestrate these heterogeneous resources efficiently, we formulate a multi-objective optimization problem: maximizing resource efficiency, minimizing end-to-end latency, and maximizing user request acceptance. We solve this problem with a deep reinforcement learning (DRL) framework that adaptively assigns functions across all layers in real time. Our popularity-based replication and pre-caching further reduce latency for the most frequently accessed assets, while lightweight rendering operations are offloaded directly onto programmable switches to cut round-trip delays. Extensive simulations, benchmarked against multiple baselines, demonstrate that our approach consistently maintains sub-20 ms end-to-end delays and achieves superior resource utilization under dynamic workloads. These results validate the potential of integrating INC into the compute continuum under DRL-driven orchestration, which together meet the stringent Quality of Service (QoS) and Quality of Experience (QoE) requirements of next-generation immersive applications.
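The abstract describes two concrete mechanisms: a scalarized multi-objective reward driving the DRL agent, and popularity-based pre-caching of the most frequently requested assets. The sketch below illustrates both ideas in minimal form; the function names, weights, latency budget, and normalization are illustrative assumptions, not the authors' actual formulation.

```python
from collections import Counter

def reward(resource_efficiency: float,   # assumed normalized to [0, 1], higher is better
           latency_ms: float,            # end-to-end delay observed for the request
           accepted: bool,               # whether the user request was admitted
           latency_budget_ms: float = 20.0,   # hypothetical budget matching the sub-20 ms target
           w_eff: float = 0.4, w_lat: float = 0.4, w_acc: float = 0.2) -> float:
    """Illustrative weighted scalarization of the paper's three objectives:
    maximize resource efficiency, minimize latency, maximize acceptance."""
    # Map latency onto [0, 1]: full score at 0 ms, zero at or beyond the budget.
    latency_score = max(0.0, 1.0 - latency_ms / latency_budget_ms)
    return w_eff * resource_efficiency + w_lat * latency_score + w_acc * float(accepted)

def assets_to_precache(request_log: list[str], k: int = 3) -> list[str]:
    """Popularity-based pre-caching: pick the k most frequently requested assets
    for replication closer to users (e.g., at INC or edge nodes)."""
    return [asset for asset, _ in Counter(request_log).most_common(k)]
```

For example, an admitted request served at 10 ms with resource efficiency 0.5 would score `0.4*0.5 + 0.4*0.5 + 0.2 = 0.6` under these assumed weights; the agent is thus pushed jointly toward low delay, high utilization, and high acceptance rather than any single objective.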