人工智能介导的3D视频会议

ACM SIGGRAPH 2023 Emerging Technologies Pub Date : 2023-07-26 DOI:10.1145/3588037.3595385

Michael Stengel, Koki Nagano, Chao Liu, Matthew Chan, Alex Trevithick, Shalini De Mello, Jonghyun Kim, D. Luebke

{"title":"人工智能介导的3D视频会议","authors":"Michael Stengel, Koki Nagano, Chao Liu, Matthew Chan, Alex Trevithick, Shalini De Mello, Jonghyun Kim, D. Luebke","doi":"10.1145/3588037.3595385","DOIUrl":null,"url":null,"abstract":"We present an AI-mediated 3D video conferencing system that can reconstruct and autostereoscopically display a life-sized talking head using consumer-grade compute resources and minimal capture equipment. Our 3D capture uses a novel 3D lifting method that encodes a given 2D input into an efficient triplanar neural representation of the user, which can be rendered from novel viewpoints in real-time. Our AI-based techniques drastically reduce the cost for 3D capture, while providing a high-fidelity 3D representation on the receiver’s end at the cost of traditional 2D video streaming. Additional advantages of our AI-based approach include the ability to accommodate both photorealistic and stylized avatars, and the ability to enable mutual eye contact in multi-directional video conferencing. We demonstrate our system using a tracked stereo display for a personal viewing experience as well as a lightfield display for a room-scale multi-viewer experience.","PeriodicalId":348151,"journal":{"name":"ACM SIGGRAPH 2023 Emerging Technologies","volume":"178 ","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"AI-Mediated 3D Video Conferencing\",\"authors\":\"Michael Stengel, Koki Nagano, Chao Liu, Matthew Chan, Alex Trevithick, Shalini De Mello, Jonghyun Kim, D. Luebke\",\"doi\":\"10.1145/3588037.3595385\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We present an AI-mediated 3D video conferencing system that can reconstruct and autostereoscopically display a life-sized talking head using consumer-grade compute resources and minimal capture equipment. Our 3D capture uses a novel 3D lifting method that encodes a given 2D input into an efficient triplanar neural representation of the user, which can be rendered from novel viewpoints in real-time. Our AI-based techniques drastically reduce the cost for 3D capture, while providing a high-fidelity 3D representation on the receiver’s end at the cost of traditional 2D video streaming. Additional advantages of our AI-based approach include the ability to accommodate both photorealistic and stylized avatars, and the ability to enable mutual eye contact in multi-directional video conferencing. We demonstrate our system using a tracked stereo display for a personal viewing experience as well as a lightfield display for a room-scale multi-viewer experience.\",\"PeriodicalId\":348151,\"journal\":{\"name\":\"ACM SIGGRAPH 2023 Emerging Technologies\",\"volume\":\"178 \",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-07-26\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"ACM SIGGRAPH 2023 Emerging Technologies\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3588037.3595385\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM SIGGRAPH 2023 Emerging Technologies","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3588037.3595385","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 0

摘要

我们提出了一种人工智能介导的3D视频会议系统，该系统可以使用消费级计算资源和最小的捕获设备重建和自动立体显示真人大小的说话头。我们的3D捕获使用了一种新颖的3D提升方法，该方法将给定的2D输入编码为用户的有效三平面神经表示，可以从新的视点实时渲染。我们基于人工智能的技术大大降低了3D捕获的成本，同时以传统2D视频流的成本在接收器端提供高保真的3D表示。我们基于人工智能的方法的其他优点包括能够适应逼真的和风格化的化身，以及能够在多向视频会议中实现相互眼神接触的能力。我们使用跟踪立体显示器来演示我们的系统，用于个人观看体验，以及用于房间规模的多观众体验的光场显示器。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

AI-Mediated 3D Video Conferencing

We present an AI-mediated 3D video conferencing system that can reconstruct and autostereoscopically display a life-sized talking head using consumer-grade compute resources and minimal capture equipment. Our 3D capture uses a novel 3D lifting method that encodes a given 2D input into an efficient triplanar neural representation of the user, which can be rendered from novel viewpoints in real-time. Our AI-based techniques drastically reduce the cost for 3D capture, while providing a high-fidelity 3D representation on the receiver’s end at the cost of traditional 2D video streaming. Additional advantages of our AI-based approach include the ability to accommodate both photorealistic and stylized avatars, and the ability to enable mutual eye contact in multi-directional video conferencing. We demonstrate our system using a tracked stereo display for a personal viewing experience as well as a lightfield display for a room-scale multi-viewer experience.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

ACM SIGGRAPH 2023 Emerging Technologies

自引率

0.00%

发文量