A Novel Adaptive $360^{\circ}$ Livestreaming With Graph Representation Learning Based FoV Prediction

IF 5.4; Zone 2, Computer Science; Q1, COMPUTER SCIENCE, INFORMATION SYSTEMS

IEEE Transactions on Emerging Topics in Computing Pub Date : 2024-08-09 DOI:10.1109/TETC.2024.3435002

Xingyan Chen;Huaming Du;Mu Wang;Yu Zhao;Xiaoyang Shu;Changqiao Xu;Gabriel-Miro Muntean

Volume 13, Issue 2, Pages 537-550. Full text: https://ieeexplore.ieee.org/document/10633261/

Citations: 0

Abstract

The exceptionally high bandwidth requirements associated with the delivery of live $360^{\circ}$ video content pose significant challenges in the current network context. An avenue for addressing this bandwidth challenge is to use the limited network resources for sending the user's Field-of-View (FoV) tiles at a high resolution, instead of transmitting all frame components at high quality. However, precisely forecasting the FoV for $360^{\circ}$ live video content distribution remains a complex endeavor due to the lack of pre-knowledge on user viewing behaviors. In this paper, we present GL360, a novel $360^{\circ}$ transmission framework, which employs Graph Representation Learning for FoV prediction. First, we analyze the interaction between users and tiles in panoramic videos utilizing a dynamic heterogeneous Relational Graph Convolutional Network (RGCN), which facilitates efficient user and tile embedding representation learning. Second, we propose an online dynamic heterogeneous graph learning (DHGL)-based algorithm to dynamically capture the time-varying features of the user's viewing behaviors with limited prior knowledge. Further, we design a FoV-aware content delivery algorithm that allows the edge servers to determine the video tiles’ resolution for each accessed user. Experimental results based on real traces demonstrate how our solution outperforms four other solutions in terms of FoV prediction and network performance.
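To make the FoV-aware delivery idea concrete, the sketch below shows one simple way an edge server could turn predicted per-tile viewing probabilities into per-tile bitrates under a bandwidth budget. It is not the GL360 algorithm, which the abstract does not specify; the function name `allocate_tile_bitrates`, the greedy rule, the bitrate ladder, and all numbers are assumptions chosen for illustration only.

```python
from typing import Dict, List


def allocate_tile_bitrates(
    view_prob: Dict[int, float],
    bitrate_ladder: List[float],
    budget: float,
) -> Dict[int, float]:
    """Greedy illustration (hypothetical, not the paper's method).

    Every tile starts at the lowest rung of the ladder. Tiles are then
    upgraded one rung at a time, always picking the affordable upgrade with
    the highest predicted viewing probability per extra unit of bandwidth.
    """
    ladder = sorted(set(bitrate_ladder))  # strictly increasing rungs
    level = {tile: 0 for tile in view_prob}
    spent = ladder[0] * len(view_prob)
    if spent > budget:
        raise ValueError("budget cannot cover the lowest rung for every tile")

    while True:
        best_tile, best_gain = None, 0.0
        for tile, lvl in level.items():
            if lvl + 1 >= len(ladder):
                continue  # tile is already at the top rung
            extra = ladder[lvl + 1] - ladder[lvl]
            if spent + extra > budget:
                continue  # upgrade does not fit in the remaining budget
            gain = view_prob[tile] / extra
            if gain > best_gain:
                best_tile, best_gain = tile, gain
        if best_tile is None:
            break  # no affordable upgrade with positive gain remains
        spent += ladder[level[best_tile] + 1] - ladder[level[best_tile]]
        level[best_tile] += 1

    return {tile: ladder[lvl] for tile, lvl in level.items()}


# Assumed example: four tiles, a three-rung ladder in Mbps, a 9 Mbps budget.
predicted = {0: 0.9, 1: 0.5, 2: 0.05, 3: 0.0}
plan = allocate_tile_bitrates(predicted, [1.0, 2.0, 4.0], budget=9.0)
print(plan)
```

In this toy run the tile most likely to fall inside the FoV receives the top rung, while a tile with zero predicted probability stays at the lowest rung, which mirrors the abstract's motivation of spending limited bandwidth on FoV tiles rather than on the whole frame.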


Source journal

IEEE Transactions on Emerging Topics in Computing Computer Science-Computer Science (miscellaneous)

CiteScore

12.10

Self-citation rate

5.10%

Number of published articles

113

Journal description: IEEE Transactions on Emerging Topics in Computing publishes papers on emerging aspects of computer science, computing technology, and computing applications not currently covered by other IEEE Computer Society Transactions. Some examples of emerging topics in computing include: IT for Green, Synthetic and organic computing structures and systems, Advanced analytics, Social/occupational computing, Location-based/client computer systems, Morphic computer design, Electronic game systems, and Health-care IT.