A Quality of Experience and Visual Attention Evaluation for 360° videos with non-spatial and spatial audio

Impact factor: 5.2 · Zone 3 (Computer Science) · JCR Q1 (Computer Science, Information Systems)
Amit Hirway, Yuansong Qiao, Niall Murray
Journal: ACM Transactions on Multimedia Computing Communications and Applications
Publication date: 2024-03-06
Publication type: Journal Article
DOI: https://doi.org/10.1145/3650208
Citations: 0

A Quality of Experience and Visual Attention Evaluation for 360° videos with non-spatial and spatial audio

This article presents the results of an empirical study that investigated the influence of various types of audio (spatial and non-spatial) on users' quality of experience (QoE) and visual attention when watching 360° videos. The study compared the head pose, eye gaze, pupil dilation, heart rate and subjective responses of 73 users who watched ten 360° videos with different sound configurations. The configurations evaluated were no sound; non-spatial (stereo) audio; and two spatial sound conditions (first- and third-order ambisonics). The videos covered various categories and presented both indoor and outdoor scenarios. The subjective responses were analyzed using an ANOVA (Analysis of Variance) to assess mean differences between sound conditions. Data visualization was also employed to enhance the interpretability of the results. The findings reveal diverse viewing patterns, physiological responses, and subjective experiences among users watching 360° videos with different sound conditions. Spatial audio, in particular third-order ambisonics, garnered heightened attention, as is evident in increased pupil dilation and heart rate. Furthermore, the presence of spatial audio led to more diverse head poses when sound sources were distributed across the scene. These findings have important implications for the development of effective techniques for optimizing the processing, encoding, distribution, and rendering of content in VR and 360° videos with spatialized audio. These insights are also relevant in the creative realms of content design and enhancement. They provide valuable guidance on how spatial audio influences user attention, physiological responses, and overall subjective experiences. Understanding these dynamics can assist content creators and designers in crafting immersive experiences that leverage spatialized audio to captivate users, enhance engagement, and optimize the overall quality of virtual reality and 360° video content. The dataset, the scripts used for data collection, the ffmpeg commands used for processing the videos, and the subjective questionnaire and its statistical analysis are publicly available.
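
The abstract states that subjective responses were compared across sound conditions with an ANOVA. The following is a minimal sketch of how such a comparison of means can be computed, not the authors' analysis script. The abstract does not say which ANOVA variant was used, so this sketch assumes a plain one-way, between-groups form; the function name `one_way_anova_f` and all rating values are hypothetical and are not taken from the study's dataset.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of lists, one list of scores per condition.
    """
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k = len(groups)        # number of conditions
    n = len(all_values)    # total number of observations

    # Variation of the condition means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own condition mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)

    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical 1-5 ratings for the four sound conditions named in the
# abstract. These numbers are made up purely for illustration.
ratings = {
    "no_sound": [2, 3, 2, 3, 2],
    "stereo": [3, 4, 3, 4, 3],
    "first_order_ambisonics": [4, 4, 3, 5, 4],
    "third_order_ambisonics": [4, 5, 4, 5, 5],
}

f_stat, df_b, df_w = one_way_anova_f(list(ratings.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.3f}")
```

A large F value indicates that the differences between condition means are large relative to the spread within each condition. Because each participant in a study like this one may rate several conditions, a repeated-measures design could be more appropriate in practice; the between-groups form is used here only to keep the sketch short.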

Source journal
CiteScore: 8.50
Self-citation rate: 5.90%
Articles published: 285
Review time: 7.5 months
Journal description: The ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM) is the flagship publication of the ACM Special Interest Group in Multimedia (SIGMM). It solicits paper submissions on all aspects of multimedia. Papers on single media (for instance, audio, video, animation) and their processing are also welcome. TOMM is a peer-reviewed, archival journal, available in both print and digital form. The journal is published quarterly, with roughly seven 23-page articles in each issue. In addition, all Special Issues are published online-only to ensure timely publication. The Transactions consist primarily of research papers. This is an archival journal, and it is intended that the papers will have lasting importance and value over time. In general, papers whose primary focus is on particular multimedia products or the current state of the industry will not be included.