Neural Categorical Priors for Physics-Based Character Control

ACM Transactions on Graphics (TOG) Pub Date : 2023-08-14 DOI:10.1145/3618397

Qing Zhu, He Zhang, Mengting Lan, Lei Han


Citations: 0

Abstract

Recent advances in learning reusable motion priors have demonstrated their effectiveness in generating naturalistic behaviors. In this paper, we propose a new learning framework in this paradigm for controlling physics-based characters with improved motion quality and diversity over existing methods. The proposed method first uses reinforcement learning (RL) to track and imitate life-like movements from unstructured motion clips, using the discrete information bottleneck adopted in the Vector Quantized Variational AutoEncoder (VQ-VAE). This structure compresses the most relevant information from the motion clips into a compact yet informative latent space, i.e., a discrete space over vector-quantized codes. By sampling codes in this space from a trained categorical prior distribution, high-quality life-like behaviors can be generated, similar to the usage of VQ-VAE in computer vision. Although this prior distribution can be trained with the supervision of the encoder's output, it follows the original motion clip distribution in the dataset and could lead to imbalanced behaviors in our setting. To address this issue, we further propose a technique named prior shifting, which adjusts the prior distribution using curiosity-driven RL. The resulting distribution is demonstrated to offer sufficient behavioral diversity and to significantly facilitate upper-level policy learning for downstream tasks. We conduct comprehensive experiments using humanoid characters on two challenging downstream tasks: sword-shield striking and a two-player boxing game. Our results demonstrate that the proposed framework is capable of controlling the character to perform high-quality movements in terms of behavioral strategies, diversity, and realism. Videos, code, and data are available at https://tencent-roboticsx.github.io/NCP/.
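
The discrete bottleneck and the prior sampling described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy codebook, the prior logits, and the function names `quantize`, `softmax`, and `sample_code` are all hypothetical, chosen only to show nearest-code lookup (the vector quantization step) and sampling a code from a categorical distribution. The RL-based imitation and prior shifting stages are not modeled here.

```python
import math
import random


def quantize(z, codebook):
    """Return (index, code) of the codebook entry nearest to z in squared L2 distance."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    index = min(range(len(codebook)), key=lambda k: sq_dist(z, codebook[k]))
    return index, codebook[index]


def softmax(logits):
    """Turn unnormalized logits into categorical probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_code(prior_logits, codebook, rng):
    """Sample one code from a categorical prior defined over codebook indices."""
    probs = softmax(prior_logits)
    index = rng.choices(range(len(codebook)), weights=probs, k=1)[0]
    return index, codebook[index]


# Hypothetical 2-D codebook with four codes (an assumption for illustration only).
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Encoding path: a continuous encoder output is snapped to its nearest code.
encoder_output = [0.9, 0.2]
index, code = quantize(encoder_output, codebook)

# Generation path: a code is drawn from the categorical prior instead.
rng = random.Random(0)
prior_logits = [0.5, 1.5, 0.0, -1.0]  # hypothetical learned logits
sampled_index, sampled_code = sample_code(prior_logits, codebook, rng)
```

In this toy setting, an imbalanced prior corresponds to logits that concentrate probability on a few codes. The abstract's prior shifting adjusts that distribution with curiosity-driven RL, which this sketch does not attempt to reproduce.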
