Neural Categorical Priors for Physics-Based Character Control

Qing Zhu, He Zhang, Mengting Lan, Lei Han
ACM Transactions on Graphics (TOG), pp. 1-16. Published 2023-08-14. DOI: https://doi.org/10.1145/3618397

Abstract

Recent advances in learning reusable motion priors have demonstrated their effectiveness in generating naturalistic behaviors. In this paper, we propose a new learning framework in this paradigm for controlling physics-based characters with improved motion quality and diversity over existing methods. The proposed method first uses reinforcement learning (RL) to track and imitate life-like movements from unstructured motion clips through a discrete information bottleneck, as adopted in the Vector Quantized Variational AutoEncoder (VQ-VAE). This structure compresses the most relevant information from the motion clips into a compact yet informative latent space, i.e., a discrete space over vector quantized codes. By sampling codes in the space from a trained categorical prior distribution, high-quality life-like behaviors can be generated, similar to the use of VQ-VAE in computer vision. Although this prior distribution can be trained with the supervision of the encoder's output, it follows the original motion clip distribution in the dataset and could lead to imbalanced behaviors in our setting. To address the issue, we further propose a technique named prior shifting to adjust the prior distribution using curiosity-driven RL. The resulting distribution is shown to offer sufficient behavioral diversity and to significantly facilitate upper-level policy learning for downstream tasks. We conduct comprehensive experiments using humanoid characters on two challenging downstream tasks, sword-shield striking and a two-player boxing game. Our results demonstrate that the proposed framework is capable of controlling the character to perform high-quality movements in terms of behavioral strategies, diversity, and realism. Videos, code, and data are available at https://tencent-roboticsx.github.io/NCP/.
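
To make the two core operations in the abstract concrete, the following is a minimal NumPy sketch of (a) vector quantization, where a continuous latent is snapped to its nearest codebook entry, and (b) sampling code indices from a categorical prior. It is an illustration under stated assumptions, not the authors' implementation: the function names `quantize` and `sample_codes`, the codebook size, the latent dimension, and the uniform prior logits are all hypothetical, and the RL training of the encoder, decoder, and prior (including prior shifting) is not shown.

```python
import numpy as np

def quantize(z, codebook):
    """Map each latent in z (N, D) to its nearest code in codebook (K, D).

    Returns the code indices (N,) and the quantized latents (N, D).
    """
    # Squared Euclidean distance between every latent and every code: (N, K).
    d2 = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    idx = d2.argmin(axis=1)
    return idx, codebook[idx]

def sample_codes(logits, n, rng):
    """Draw n code indices from a categorical prior given logits of shape (K,)."""
    # Numerically stable softmax turns logits into probabilities.
    p = np.exp(logits - logits.max())
    p /= p.sum()
    return rng.choice(len(logits), size=n, p=p)

rng = np.random.default_rng(0)

# Hypothetical sizes: K = 8 codes, each of dimension D = 4.
codebook = rng.normal(size=(8, 4))

# Two latents that lie close to codes 2 and 5, standing in for encoder outputs.
z = codebook[[2, 5]] + 0.01 * rng.normal(size=(2, 4))
idx, z_q = quantize(z, codebook)

# A uniform categorical prior over the 8 codes (assumption for illustration).
prior_logits = np.zeros(8)
codes = sample_codes(prior_logits, 16, rng)
```

In this sketch, a prior fitted to encoder outputs would correspond to logits that mirror how often each code appears in the dataset, which is the imbalance the abstract describes. Prior shifting would then adjust those logits, which the paper does with curiosity-driven RL rather than by hand.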