Temporal Interpolation of Dynamic Digital Humans using Convolutional Neural Networks

Irene Viola, J. Mulder, F. D. Simone, Pablo César
{"title":"使用卷积神经网络的动态数字人的时间插值","authors":"Irene Viola, J. Mulder, F. D. Simone, Pablo César","doi":"10.1109/AIVR46125.2019.00022","DOIUrl":null,"url":null,"abstract":"In recent years, there has been an increased interest in point cloud representation for visualizing digital humans in cross reality. However, due to their voluminous size, point clouds require high bandwidth to be transmitted. In this paper, we propose a temporal interpolation architecture capable of increasing the temporal resolution of dynamic digital humans, represented using point clouds. With this technique, bandwidth savings can be achieved by transmitting dynamic point clouds in a lower temporal resolution, and recreating a higher temporal resolution on the receiving side. Our interpolation architecture works by first downsampling the point clouds to a lower spatial resolution, then estimating scene flow using a newly designed neural network architecture, and finally upsampling the result back to the original spatial resolution. To improve the smoothness of the results, we additionally apply a novel technique called neighbour snapping. To be able to train and test our newly designed network, we created a synthetic point cloud data set of animated human bodies. Results from the evaluation of our architecture through a small-scale user study show the benefits of our method with respect to the state of the art in scene flow estimation for point clouds. Moreover, correlation between our user study and existing objective quality metrics confirm the need for new metrics to accurately predict the visual quality of point cloud contents.","PeriodicalId":274566,"journal":{"name":"2019 IEEE International Conference on Artificial Intelligence and Virtual Reality (AIVR)","volume":"37 ","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"6","resultStr":"{\"title\":\"Temporal Interpolation of Dynamic Digital Humans using Convolutional Neural Networks\",\"authors\":\"Irene Viola, J. Mulder, F. D. Simone, Pablo César\",\"doi\":\"10.1109/AIVR46125.2019.00022\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In recent years, there has been an increased interest in point cloud representation for visualizing digital humans in cross reality. However, due to their voluminous size, point clouds require high bandwidth to be transmitted. In this paper, we propose a temporal interpolation architecture capable of increasing the temporal resolution of dynamic digital humans, represented using point clouds. With this technique, bandwidth savings can be achieved by transmitting dynamic point clouds in a lower temporal resolution, and recreating a higher temporal resolution on the receiving side. Our interpolation architecture works by first downsampling the point clouds to a lower spatial resolution, then estimating scene flow using a newly designed neural network architecture, and finally upsampling the result back to the original spatial resolution. To improve the smoothness of the results, we additionally apply a novel technique called neighbour snapping. To be able to train and test our newly designed network, we created a synthetic point cloud data set of animated human bodies. Results from the evaluation of our architecture through a small-scale user study show the benefits of our method with respect to the state of the art in scene flow estimation for point clouds. 
Moreover, correlation between our user study and existing objective quality metrics confirm the need for new metrics to accurately predict the visual quality of point cloud contents.\",\"PeriodicalId\":274566,\"journal\":{\"name\":\"2019 IEEE International Conference on Artificial Intelligence and Virtual Reality (AIVR)\",\"volume\":\"37 \",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"6\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2019 IEEE International Conference on Artificial Intelligence and Virtual Reality (AIVR)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/AIVR46125.2019.00022\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 IEEE International Conference on Artificial Intelligence and Virtual Reality (AIVR)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/AIVR46125.2019.00022","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Cited by: 6

Abstract

In recent years, there has been an increased interest in point cloud representation for visualizing digital humans in cross reality. However, due to their voluminous size, point clouds require high bandwidth to be transmitted. In this paper, we propose a temporal interpolation architecture capable of increasing the temporal resolution of dynamic digital humans, represented using point clouds. With this technique, bandwidth savings can be achieved by transmitting dynamic point clouds in a lower temporal resolution, and recreating a higher temporal resolution on the receiving side. Our interpolation architecture works by first downsampling the point clouds to a lower spatial resolution, then estimating scene flow using a newly designed neural network architecture, and finally upsampling the result back to the original spatial resolution. To improve the smoothness of the results, we additionally apply a novel technique called neighbour snapping. To be able to train and test our newly designed network, we created a synthetic point cloud data set of animated human bodies. Results from the evaluation of our architecture through a small-scale user study show the benefits of our method with respect to the state of the art in scene flow estimation for point clouds. Moreover, correlation between our user study and existing objective quality metrics confirm the need for new metrics to accurately predict the visual quality of point cloud contents.
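As a rough illustration of the pipeline described above (spatial downsampling, scene-flow estimation with a neural network, temporal interpolation, and upsampling back to the original resolution), the sketch below shows how an intermediate frame could be synthesised once a per-point flow field is available. The function names, the random-subsampling strategy, and the midpoint-displacement step are assumptions for illustration only; the scene-flow network itself is stubbed out and the paper's neighbour-snapping technique is not reproduced here.

```python
# Minimal sketch of the temporal-interpolation idea, assuming a per-point
# scene-flow field between two consecutive point-cloud frames. Names and
# strategies are illustrative, not the authors' implementation.
import numpy as np


def downsample(points: np.ndarray, target_size: int, rng=None) -> np.ndarray:
    """Reduce a point cloud of shape (N, 3) to a lower spatial resolution
    by random subsampling (a simple stand-in for the paper's downsampling)."""
    rng = rng or np.random.default_rng(0)
    idx = rng.choice(len(points), size=target_size, replace=False)
    return points[idx]


def interpolate_frame(points_t0: np.ndarray, flow: np.ndarray,
                      alpha: float = 0.5) -> np.ndarray:
    """Synthesise an intermediate frame by moving each point a fraction
    `alpha` along its estimated scene-flow vector (0 = frame t0, 1 = frame t1)."""
    return points_t0 + alpha * flow


# Example usage: the flow would normally come from the scene-flow network;
# here it is a zero placeholder so the script runs end to end.
frame_t0 = np.random.rand(100_000, 3).astype(np.float32)   # dense capture at t0
coarse_t0 = downsample(frame_t0, target_size=8_192)        # lower spatial resolution
flow = np.zeros_like(coarse_t0)                             # stub for network output
coarse_mid = interpolate_frame(coarse_t0, flow, alpha=0.5)  # intermediate frame
```

In the architecture described by the abstract, the interpolated coarse frame would then be upsampled back to the original spatial resolution on the receiving side, with the neighbour-snapping technique applied additionally to improve the smoothness of the result.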