Using Deep Reinforcement Learning (DRL) to Optimize Quality in 360-Degree Video Tile Management

IF 4.8; Zone 1 (Computer Science); JCR Q2 (Engineering, Electrical & Electronic)

IEEE Transactions on Broadcasting; Pub Date: 2025-03-11; DOI: 10.1109/TBC.2025.3541860

Chunguang Li;Dayoung Lee;Minseok Song

Volume 71, Issue 2, pages 555-569. Full text: https://ieeexplore.ieee.org/document/10922854/

Citations: 0

Abstract


Using Deep Reinforcement Learning (DRL) to Optimize Quality in 360-Degree Video Tile Management

360-degree videos inherently require significant storage space because each segment consists of many tiles, each of which is further transcoded and stored in multiple versions. It is thus impractical to store all transcoded versions, which makes it essential to make effective use of limited storage space. However, the inefficiency of existing heuristic-based management schemes arises from the challenge of incorporating various factors, such as variable bandwidth requirements influenced by network conditions, tile access distribution, and video quality dependent on content. To address this, we propose a new storage space management scheme, which combines the dueling deep Q-network (DQN) algorithm based on the field-of-view (FoV) distribution and the greedy algorithm that considers the overall video popularity. We first model an environment in which the agent can determine the versions for each tile to achieve the best video quality under various storage limit conditions. The dueling DQN environment comprises 1) an action space determining version combinations for each tile within specified storage limits, 2) an observation space enabling the agent to learn variable bandwidths and tile access distributions, and 3) a reward model deriving the expected video quality for different actions. Building upon the dueling DQN model correlating storage limits with expected video quality, we present a greedy algorithm that selects versions among multiple videos within storage limits for the purpose of maximizing popularity-weighted video quality. Extensive simulations evaluated the proposed scheme under various storage limits, bandwidth changes, and FoV distributions, demonstrating an improvement in overall popularity-weighted video quality ranging from 0.49% to 37.77% (with an average improvement of 13.96%) compared to existing benchmark schemes.
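The abstract describes two building blocks: a dueling DQN that relates a storage limit to expected video quality, and a greedy step that distributes storage across videos to maximize popularity-weighted quality. The sketch below illustrates both under stated assumptions. `dueling_q_values` uses the standard dueling aggregation Q(s, a) = V(s) + A(s, a) - mean(A(s, ·)); the paper's actual network, action space, and reward model are not given in the abstract and are not reproduced here. `greedy_allocate` is a hypothetical marginal-gain-per-storage heuristic: the names `curves`, `popularity`, and `total_storage`, and the upgrade rule itself, are illustrative assumptions rather than the authors' algorithm.

```python
from typing import Dict, List, Tuple


def dueling_q_values(state_value: float, advantages: List[float]) -> List[float]:
    """Standard dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean(A(s, .))."""
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + a - mean_adv for a in advantages]


def greedy_allocate(
    curves: Dict[str, List[Tuple[float, float]]],
    popularity: Dict[str, float],
    total_storage: float,
) -> Dict[str, int]:
    """Choose one (storage, expected_quality) option per video under a shared budget.

    curves[v] is an assumed per-video list of options sorted by increasing
    storage (for example, read off a trained model); option 0 is the baseline
    that every video must keep. Returns the chosen option index per video.
    """
    choice = {v: 0 for v in curves}
    used = sum(options[0][0] for options in curves.values())
    if used > total_storage:
        raise ValueError("budget too small for the baseline options")

    while True:
        best_video, best_ratio = None, 0.0
        for v, options in curves.items():
            i = choice[v]
            if i + 1 >= len(options):
                continue  # already at the largest option
            extra = options[i + 1][0] - options[i][0]
            if extra <= 0 or used + extra > total_storage:
                continue  # upgrade does not fit in the remaining budget
            # Popularity-weighted quality gain per unit of extra storage.
            ratio = popularity[v] * (options[i + 1][1] - options[i][1]) / extra
            if ratio > best_ratio:
                best_video, best_ratio = v, ratio
        if best_video is None:
            return choice  # no affordable upgrade improves quality
        i = choice[best_video]
        used += curves[best_video][i + 1][0] - curves[best_video][i][0]
        choice[best_video] = i + 1


if __name__ == "__main__":
    # Made-up numbers purely for illustration.
    curves = {"a": [(1, 30), (2, 36), (4, 40)], "b": [(1, 30), (3, 38)]}
    popularity = {"a": 0.7, "b": 0.3}
    print(greedy_allocate(curves, popularity, total_storage=5))
```

With these made-up inputs, the more popular video "a" absorbs the whole budget because each of its upgrades yields more weighted quality per unit of storage than the upgrade for "b". A greedy rule of this kind is simple and fast but is not guaranteed to be optimal for every set of curves.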


Source journal

IEEE Transactions on Broadcasting (Engineering & Technology: Telecommunications)

CiteScore: 9.40

Self-citation rate: 31.10%

Articles published:

Review time: 6-12 weeks

Journal description: The Society’s Field of Interest is “Devices, equipment, techniques and systems related to broadcast technology, including the production, distribution, transmission, and propagation aspects.” In addition to this formal FOI statement, which is used to provide guidance to the Publications Committee in the selection of content, the AdCom has further resolved that “broadcast systems includes all aspects of transmission, propagation, and reception.”