Selectable Directional Audio for Multiple Telepresence in Immersive Intelligent Environments

Alfonso Torrejon, V. Callaghan, H. Hagras
{"title":"Selectable Directional Audio for Multiple Telepresence in Immersive Intelligent Environments","authors":"Alfonso Torrejon, V. Callaghan, H. Hagras","doi":"10.1109/IE.2013.33","DOIUrl":null,"url":null,"abstract":"The general focus of this paper concerns the development of telepresence within intelligent immersive environments. The overall aim is the development of a system that combines multiple audio and video feeds from geographically dispersed people into a single environment view, where sound appears to be linked to the appropriate visual source on a panoramic viewer based on the gaze of the user. More specifically this paper describes a novel directional audio system for telepresence which seeks to reproduce sound sources (conversations) in a panoramic viewer in their correct spatial positions to increase the realism associated with telepresence applications such as online meetings. The intention of this work is that external attendees to an online meeting would be able to move their head to focus on the video and audio stream from a particular person or group so as decrease the audio from all other streams (i.e. speakers) to a background level. The main contribution of this paper is a methodology that captures and reproduces these spatial audio and video relationships. In support of this we have created a multiple camera recording scheme to emulate the behavior of a panoramic camera, or array of cameras, at such meeting which uses the Chroma key photographic effect to integrate all streams into a common panoramic video image thereby creating a common shared virtual space. While this emulation is only implemented as an experiment, it opens the opportunity to create telepresence systems with selectable real time video and audio streaming using multiple camera arrays. 
Finally we report on the results of an evaluation of our spatial audio scheme that demonstrates that the techniques both work and improve the users' experience, by comparing a traditional omni directional audio scheme versus selectable directional binaural audio scenarios.","PeriodicalId":353156,"journal":{"name":"2013 9th International Conference on Intelligent Environments","volume":"20 3 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2013-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2013 9th International Conference on Intelligent Environments","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IE.2013.33","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2

Abstract

The general focus of this paper is the development of telepresence within intelligent immersive environments. The overall aim is to develop a system that combines multiple audio and video feeds from geographically dispersed people into a single environment view, where sound appears linked to the appropriate visual source on a panoramic viewer, based on the gaze of the user. More specifically, this paper describes a novel directional audio system for telepresence that seeks to reproduce sound sources (conversations) in their correct spatial positions within a panoramic viewer, so as to increase the realism of telepresence applications such as online meetings. The intention of this work is that external attendees of an online meeting would be able to move their head to focus on the video and audio stream from a particular person or group, so as to decrease the audio from all other streams (i.e. speakers) to a background level. The main contribution of this paper is a methodology that captures and reproduces these spatial audio and video relationships. In support of this, we have created a multiple-camera recording scheme that emulates the behaviour of a panoramic camera, or array of cameras, at such a meeting, using the chroma-key photographic effect to integrate all streams into a common panoramic video image, thereby creating a common shared virtual space. While this emulation is implemented only as an experiment, it opens the opportunity to create telepresence systems with selectable real-time video and audio streaming using multiple camera arrays. Finally, we report the results of an evaluation of our spatial audio scheme, comparing a traditional omnidirectional audio scheme against selectable directional binaural audio scenarios; the evaluation demonstrates that the techniques both work and improve the users' experience.
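The gaze-driven behaviour the abstract describes — the focused stream plays at full level while all other speakers are attenuated to a background level, and each source is rendered at its spatial position — can be illustrated with a minimal sketch. This is not the authors' implementation; the `focus_width` and `floor` parameters, and the equal-power stereo panning used as a stand-in for full binaural rendering, are assumptions for illustration only.

```python
import math

def stream_gains(gaze_azimuth, source_azimuths, focus_width=30.0, floor=0.1):
    """Per-stream gain: full volume for sources within focus_width degrees
    of the user's gaze direction, background level (floor) for the rest.
    Angles are in degrees; wrap-around at +/-180 is handled."""
    gains = []
    for az in source_azimuths:
        # Wrapped angular distance between gaze and source, in [0, 180]
        diff = abs((az - gaze_azimuth + 180.0) % 360.0 - 180.0)
        gains.append(1.0 if diff <= focus_width else floor)
    return gains

def pan_lr(azimuth):
    """Equal-power stereo pan for a source azimuth in [-90, 90] degrees
    (negative = left of centre). Returns (left_gain, right_gain); a simple
    stand-in for the binaural rendering described in the paper."""
    theta = math.radians(max(-90.0, min(90.0, azimuth)))
    angle = (theta + math.pi / 2.0) / 2.0  # map [-90, 90] deg -> [0, pi/2]
    return math.cos(angle), math.sin(angle)
```

For example, with the user gazing straight ahead and speakers at 0, 90, and -120 degrees, only the frontal speaker keeps full gain while the others drop to the background floor, and `pan_lr` places each one left or right in proportion to its azimuth.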