TRACE: 5D Temporal Regression of Avatars with Dynamic Cameras in 3D Environments

2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) · Pub Date: 2023-06-01 · DOI: 10.1109/CVPR52729.2023.00855

Yu Sun, Qian Bao, Wu Liu, Tao Mei, Michael J. Black


Citations: 7

Abstract

Although the estimation of 3D human pose and shape (HPS) is rapidly progressing, current methods still cannot reliably estimate moving humans in global coordinates, which is critical for many applications. This is particularly challenging when the camera is also moving, entangling human and camera motion. To address these issues, we adopt a novel 5D representation (space, time, and identity) that enables end-to-end reasoning about people in scenes. Our method, called TRACE, introduces several novel architectural components. Most importantly, it uses two new “maps” to reason about the 3D trajectory of people over time in camera and world coordinates. An additional memory unit enables persistent tracking of people even during long occlusions. TRACE is the first one-stage method to jointly recover and track 3D humans in global coordinates from dynamic cameras. By training it end-to-end, and using full image information, TRACE achieves state-of-the-art performance on tracking and HPS benchmarks. The code (https://www.yusun.work/TRACE/TRACE.html) and dataset (https://github.com/Arthur151/DynaCam) are released for research purposes.
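The abstract's point that a moving camera entangles human and camera motion can be illustrated with a small sketch. This is not TRACE's method or code; it only shows the standard rigid transform between camera and world coordinates, and the function name `cam_to_world`, the yaw-only camera rotation, and the toy trajectory are all assumptions made for illustration.

```python
import math

def cam_to_world(p_cam, yaw, t_wc):
    """Map a 3D point from camera to world coordinates.

    Hypothetical camera pose: a rotation of `yaw` radians about the
    vertical (y) axis followed by a translation `t_wc`.
    """
    x, y, z = p_cam
    c, s = math.cos(yaw), math.sin(yaw)
    # p_world = R_y(yaw) * p_cam + t_wc
    return (c * x + s * z + t_wc[0],
            y + t_wc[1],
            -s * x + c * z + t_wc[2])

# Toy scene: a person stands still in the world while the camera slides along x.
person_world = (0.0, 0.0, 5.0)
camera_positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]

# With no camera rotation, the camera-frame position is p_world - t_wc,
# so the stationary person appears to drift in the opposite direction.
cam_track = [tuple(p - t for p, t in zip(person_world, t_wc))
             for t_wc in camera_positions]

# Undoing the camera motion recovers a constant world-frame trajectory.
world_track = [cam_to_world(p, 0.0, t_wc)
               for p, t_wc in zip(cam_track, camera_positions)]

for p_cam, p_world in zip(cam_track, world_track):
    print("camera frame:", p_cam, "-> world frame:", p_world)
```

In the camera frame the person seems to move even though they are standing still, which is why a trajectory in camera coordinates alone is not enough to describe motion in global coordinates when the camera pose changes.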

