Optimized Multiuser Panoramic Video Transmission in VR: A Machine Learning-Driven Approach
Authors: Wei Xun, Songlin Zhang
Journal: Computer Animation and Virtual Worlds, volume 36, issue 3 (impact factor 0.9; JCR Q4, Computer Science, Software Engineering)
Publication date: 2025-06-07
Publication type: Journal Article
DOI: 10.1002/cav.70060
URL: https://onlinelibrary.wiley.com/doi/10.1002/cav.70060
In this paper, we propose a machine learning-driven model to optimize panoramic video transmission for multiple users in virtual reality environments. The model predicts users' future field of view (FOV) using historical head orientation data and video saliency information, enabling targeted video delivery based on individual perspectives. By segmenting panoramic videos into tiles and applying a pyramid coding scheme, we adaptively transmit high-quality content within users' FOVs while utilizing lower-quality transmissions for peripheral regions. This approach effectively reduces bandwidth consumption while maintaining a high-quality viewing experience. Our experimental results demonstrate that combining user viewpoint data with video saliency features significantly improves long-term FOV prediction accuracy, leading to a more efficient and user-centric transmission model. The proposed method holds great potential for enhancing the immersive experience of panoramic video streaming in VR, particularly in bandwidth-constrained environments.
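The tile-based delivery idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no algorithmic details, so the equirectangular grid size, the FOV half-angle, the number of quality levels, the rule that quality steps down with angular distance from the predicted view center (a stand-in for pyramid coding), and the function names `angular_distance` and `tile_quality_map` are all assumptions made for illustration. For several users, the sketch simply keeps the best quality any user needs for each tile.

```python
import math

def angular_distance(yaw1, pitch1, yaw2, pitch2):
    """Great-circle angle in degrees between two view directions (degrees)."""
    y1, p1, y2, p2 = map(math.radians, (yaw1, pitch1, yaw2, pitch2))
    cos_d = (math.sin(p1) * math.sin(p2)
             + math.cos(p1) * math.cos(p2) * math.cos(y1 - y2))
    # Clamp to guard against floating-point drift outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_d))))

def tile_quality_map(view_centers, rows=4, cols=8,
                     fov_half_angle=50.0, levels=3):
    """Return a rows x cols grid of quality levels (0 = highest quality).

    view_centers: list of predicted (yaw, pitch) pairs, one per user.
    All default parameter values are hypothetical.
    """
    # Width of each lower-quality ring outside the FOV (assumed rule).
    ring_width = (180.0 - fov_half_angle) / (levels - 1)
    grid = []
    for r in range(rows):
        pitch = 90.0 - (r + 0.5) * 180.0 / rows      # tile-center pitch
        row = []
        for c in range(cols):
            yaw = -180.0 + (c + 0.5) * 360.0 / cols  # tile-center yaw
            best = levels - 1
            for user_yaw, user_pitch in view_centers:
                d = angular_distance(yaw, pitch, user_yaw, user_pitch)
                if d <= fov_half_angle:
                    level = 0
                else:
                    level = min(levels - 1,
                                1 + int((d - fov_half_angle) // ring_width))
                best = min(best, level)              # best quality any user needs
            row.append(best)
        grid.append(row)
    return grid

if __name__ == "__main__":
    # One user looking straight ahead, then two users facing opposite ways.
    for line in tile_quality_map([(0.0, 0.0)]):
        print(line)
    print()
    for line in tile_quality_map([(0.0, 0.0), (180.0, 0.0)]):
        print(line)
```

In a real system the view centers would come from the FOV predictor described in the abstract, and the quality levels would map to encoded representations of each tile; both are outside the scope of this sketch.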
About the journal:
With the advent of very powerful PCs and high-end graphics cards, there has been an incredible development in Virtual Worlds, real-time computer animation and simulation, and games. At the same time, new and cheaper Virtual Reality devices have appeared, allowing interaction with these real-time Virtual Worlds and even with real worlds through Augmented Reality. Three-dimensional characters, especially Virtual Humans, are now of exceptional quality, which allows them to be used in the movie industry. But this is only a beginning: with the development of Artificial Intelligence and Agent technology, these characters will become more and more autonomous and even intelligent. They will inhabit the Virtual Worlds in a Virtual Life together with animals and plants.