Subjective and Objective Quality Assessment of Rendered Human Avatar Videos in Virtual Reality

IEEE Transactions on Image Processing: a publication of the IEEE Signal Processing Society, vol. 33, pp. 5740-5754
Pub Date: 2024-10-02
DOI: 10.1109/TIP.2024.3468881

Yu-Chih Chen; Avinab Saha; Alexandre Chapiro; Christian Häne; Jean-Charles Bazin; Bo Qiu; Stefano Zanetti; Ioannis Katsavounidis; Alan C. Bovik


Citations: 0

Abstract


We study the visual quality judgments of human subjects on digital human avatars (sometimes referred to as “holograms” in the parlance of virtual reality [VR] and augmented reality [AR] systems) that have been subjected to distortions. We also study the ability of video quality models to predict human judgments. As streaming of human avatar videos in VR and AR becomes increasingly common, more advanced human avatar video compression protocols will be required to address the trade-off between faithfully transmitting high-quality visual representations and adapting to changing bandwidth conditions. During transmission over the internet, the perceived quality of compressed human avatar videos can be severely impaired by visual artifacts. Video quality assessment (VQA) models are essential tools for optimizing trade-offs between perceptual quality and data volume in practical workflows. However, very few VQA algorithms have been developed specifically to analyze human body avatar videos, due, at least in part, to the dearth of appropriate and comprehensive datasets of adequate size. Towards filling this gap, we introduce the LIVE-Meta Rendered Human Avatar VQA Database, which contains 720 human avatar videos processed using 20 different combinations of encoding parameters, labeled with corresponding human perceptual quality judgments that were collected in six-degrees-of-freedom VR headsets. To demonstrate the usefulness of this new and unique video resource, we use it to study and compare the performance of a variety of state-of-the-art Full Reference and No Reference video quality prediction models, including a new model called HoloQA. As a service to the research community, we publicly release the metadata of the new database at https://live.ece.utexas.edu/research/LIVE-Meta-rendered-human-avatar/index.html.
