Learning Physically Simulated Tennis Skills from Broadcast Videos

Haotian Zhang, Ye Yuan, Viktor Makoviychuk, Yunrong Guo, Sanja Fidler, Xue Bin Peng, Kayvon Fatahalian
{"title":"Learning Physically Simulated Tennis Skills from Broadcast Videos","authors":"Haotian Zhang, Ye Yuan, Viktor Makoviychuk, Yunrong Guo, S. Fidler, X. B. Peng, K. Fatahalian","doi":"10.1145/3592408","DOIUrl":null,"url":null,"abstract":"We present a system that learns diverse, physically simulated tennis skills from large-scale demonstrations of tennis play harvested from broadcast videos. Our approach is built upon hierarchical models, combining a low-level imitation policy and a high-level motion planning policy to steer the character in a motion embedding learned from broadcast videos. When deployed at scale on large video collections that encompass a vast set of examples of real-world tennis play, our approach can learn complex tennis shotmaking skills and realistically chain together multiple shots into extended rallies, using only simple rewards and without explicit annotations of stroke types. To address the low quality of motions extracted from broadcast videos, we correct estimated motion with physics-based imitation, and use a hybrid control policy that overrides erroneous aspects of the learned motion embedding with corrections predicted by the high-level policy. We demonstrate that our system produces controllers for physically-simulated tennis players that can hit the incoming ball to target positions accurately using a diverse array of strokes (serves, forehands, and backhands), spins (topspins and slices), and playing styles (one/two-handed backhands, left/right-handed play). Overall, our system can synthesize two physically simulated characters playing extended tennis rallies with simulated racket and ball dynamics. Code and data for this work is available at https://research.nvidia.com/labs/toronto-ai/vid2player3d/.","PeriodicalId":7077,"journal":{"name":"ACM Transactions on Graphics (TOG)","volume":"16 1","pages":"1 - 14"},"PeriodicalIF":0.0000,"publicationDate":"2023-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"10","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Graphics (TOG)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3592408","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 10

Abstract

We present a system that learns diverse, physically simulated tennis skills from large-scale demonstrations of tennis play harvested from broadcast videos. Our approach is built upon hierarchical models, combining a low-level imitation policy and a high-level motion planning policy to steer the character in a motion embedding learned from broadcast videos. When deployed at scale on large video collections that encompass a vast set of examples of real-world tennis play, our approach can learn complex tennis shotmaking skills and realistically chain together multiple shots into extended rallies, using only simple rewards and without explicit annotations of stroke types. To address the low quality of motions extracted from broadcast videos, we correct estimated motion with physics-based imitation, and use a hybrid control policy that overrides erroneous aspects of the learned motion embedding with corrections predicted by the high-level policy. We demonstrate that our system produces controllers for physically simulated tennis players that can hit the incoming ball to target positions accurately using a diverse array of strokes (serves, forehands, and backhands), spins (topspins and slices), and playing styles (one/two-handed backhands, left/right-handed play). Overall, our system can synthesize two physically simulated characters playing extended tennis rallies with simulated racket and ball dynamics. Code and data for this work are available at https://research.nvidia.com/labs/toronto-ai/vid2player3d/.
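As a rough illustration of the hierarchical design described in the abstract, the sketch below shows how a high-level planning policy could select a point in a learned motion embedding (plus a residual correction, mirroring the hybrid control idea) while a low-level imitation policy maps that latent and the character state to joint-level actions for the physics simulator. This is a minimal sketch, not the authors' implementation; all module names, dimensions, and the control_step interface are assumptions for illustration.

# Minimal sketch (hypothetical) of the hierarchical control loop:
# high-level policy plans in a motion embedding, low-level policy imitates in simulation.
import torch
import torch.nn as nn

LATENT_DIM = 64   # size of the learned motion embedding (assumed)
STATE_DIM = 128   # proprioceptive character state (assumed)
TASK_DIM = 16     # ball / target-position features for shot planning (assumed)
ACTION_DIM = 28   # joint actuation targets for the simulated player (assumed)

class HighLevelPolicy(nn.Module):
    """Plans in the motion embedding: outputs a latent code plus a residual correction."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM + TASK_DIM, 256), nn.ReLU(),
            nn.Linear(256, 2 * LATENT_DIM),  # latent code and correction
        )

    def forward(self, state, task):
        out = self.net(torch.cat([state, task], dim=-1))
        latent, correction = out.chunk(2, dim=-1)
        # Hybrid control: the predicted correction overrides erroneous aspects
        # of the motion sampled from the embedding.
        return latent + correction

class LowLevelImitationPolicy(nn.Module):
    """Tracks the motion described by the latent via physics-based imitation."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM + LATENT_DIM, 512), nn.ReLU(),
            nn.Linear(512, ACTION_DIM),
        )

    def forward(self, state, latent):
        return self.net(torch.cat([state, latent], dim=-1))

def control_step(high, low, sim_state, task_obs):
    """One control step: plan in the embedding, then produce simulator actions."""
    latent = high(sim_state, task_obs)
    return low(sim_state, latent)  # e.g., joint PD targets for the physics engine

if __name__ == "__main__":
    high, low = HighLevelPolicy(), LowLevelImitationPolicy()
    state, task = torch.zeros(1, STATE_DIM), torch.zeros(1, TASK_DIM)
    print(control_step(high, low, state, task).shape)  # torch.Size([1, 28])

In the paper's setting, both policies would be trained with reinforcement learning in a physics simulator (with simulated racket and ball dynamics), with the low-level policy trained first to imitate the video-derived motion data and the high-level policy trained afterward with simple task rewards such as hitting the ball to a target position.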