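From 2D to 3D video conferencing: modular RGB-D capture and reconstruction for interactive natural user representations in immersive extended reality (XR) communication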

Impact factor: 1.3 · JCR Q3 · Engineering, Electrical & Electronic

Frontiers in Signal Processing · Published: 2023-05-22 · DOI: 10.3389/frsip.2023.1139897

S. Gunkel, S. Dijkstra-Soudarissanane, H. Stokking, O. Niamut


Citations: 2

Abstract


From 2D to 3D video conferencing: modular RGB-D capture and reconstruction for interactive natural user representations in immersive extended reality (XR) communication

With recent advancements in Virtual Reality (VR) and Augmented Reality (AR) hardware, many new immersive Extended Reality (XR) applications and services have emerged. One remaining challenge is to overcome the social isolation often felt in these XR experiences and to enable natural multi-user communication with high Social Presence. While a multitude of solutions address this issue with computer-generated “artificial” avatars (based on pre-rendered 3D models), this form of user representation might not be sufficient to convey a sense of co-presence in many use cases, in particular for personal communication (for example, with family, a doctor, or sales representatives) or for applications requiring photorealistic rendering. One alternative is to capture users (and objects) with RGB-D sensors to allow real-time photorealistic representations of users. In this paper, we present a complete and modular RGB-D capture application and outline the steps needed to use RGB-D as a means of photorealistic 3D user representation. We outline different capture modalities, as well as individual functional processing blocks, with their advantages and disadvantages. We evaluate our approach in two ways: a technical evaluation of the operation of the different modules, and two small-scale user evaluations within integrated applications. The integrated applications demonstrate the modular RGB-D capture in both augmented reality and virtual reality communication use cases, tested in realistic settings. Our examples show that the proposed modular capture and reconstruction pipeline allows for easy evaluation and extension of each processing step. Furthermore, it allows parallel code execution, keeping performance overhead and delay low. Finally, our proposed methods show that integrating 3D photorealistic user representations into existing video communication transmission systems is feasible and enables new immersive extended reality applications.
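
The abstract does not describe the implementation, so the following is only an illustrative sketch of one step that an RGB-D reconstruction stage commonly contains: back-projecting depth pixels into coloured 3D points with the standard pinhole camera model. The `Intrinsics` class, the `back_project` function, and the depth cut-off used as a crude foreground filter are all hypothetical names and assumptions made for this example; they are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Color = Tuple[int, int, int]
Point = Tuple[float, float, float, Color]


@dataclass(frozen=True)
class Intrinsics:
    """Hypothetical pinhole camera parameters, all in pixels."""
    fx: float  # focal length along x
    fy: float  # focal length along y
    cx: float  # principal point x
    cy: float  # principal point y


def back_project(
    depth_mm: Sequence[Sequence[int]],
    color: Sequence[Sequence[Color]],
    intr: Intrinsics,
    max_depth_mm: int = 2000,
) -> List[Point]:
    """Turn an aligned depth/colour image pair into coloured 3D points.

    Pixels with no depth reading (0) or farther than max_depth_mm are
    skipped. The cut-off is an assumed, very simple way to drop the
    background behind a seated user.
    """
    points: List[Point] = []
    for v, row in enumerate(depth_mm):
        for u, d in enumerate(row):
            if d <= 0 or d > max_depth_mm:
                continue
            z = d / 1000.0                     # millimetres to metres
            x = (u - intr.cx) * z / intr.fx    # pinhole model, x axis
            y = (v - intr.cy) * z / intr.fy    # pinhole model, y axis
            points.append((x, y, z, color[v][u]))
    return points
```

Keeping a step like this as a pure function of one frame is what would let it be swapped out, measured in isolation, or run in a worker alongside other stages, which is the kind of modularity the abstract refers to.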
