PFPN: Continuous Control of Physically Simulated Characters using Particle Filtering Policy Network

Proceedings of the 14th ACM SIGGRAPH Conference on Motion, Interaction and Games. Pub Date: 2020-03-16. DOI: 10.1145/3487983.3488301

Pei Xu, Ioannis Karamouzas


Citations: 1



PFPN: Continuous Control of Physically Simulated Characters using Particle Filtering Policy Network

Data-driven methods for physics-based character control using reinforcement learning have been successfully applied to generate high-quality motions. However, existing approaches typically rely on Gaussian distributions to represent the action policy, which can prematurely commit to suboptimal actions when solving high-dimensional continuous control problems for highly-articulated characters. In this paper, to improve the learning performance of physics-based character controllers, we propose a framework that considers a particle-based action policy as a substitute for Gaussian policies. We exploit particle filtering to dynamically explore and discretize the action space, and track the posterior policy represented as a mixture distribution. The resulting policy can replace the unimodal Gaussian policy which has been the staple for character control problems, without changing the underlying model architecture of the reinforcement learning algorithm used to perform policy optimization. We demonstrate the applicability of our approach on various motion capture imitation tasks. Baselines using our particle-based policies achieve better imitation performance and speed of convergence as compared to corresponding implementations using Gaussians, and are more robust to external perturbations during character control. Related code is available at: https://motion-lab.github.io/PFPN.
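The abstract describes the policy only at a high level: particles discretize the action space, the policy is a mixture distribution over them, and particle filtering moves the particles during training. The sketch below is one possible reading of that idea for a single action dimension, not the authors' implementation. The class name `ParticlePolicy1D`, the Gaussian form of each particle, the weight threshold, and the jitter scale are all assumptions made for illustration; the logits stand in for what a policy network would output for a given state.

```python
import numpy as np


def log_softmax(logits):
    """Numerically stable log of the softmax weights."""
    z = np.asarray(logits, dtype=float)
    z = z - z.max()
    return z - np.log(np.exp(z).sum())


class ParticlePolicy1D:
    """Hypothetical particle-based policy for one action dimension."""

    def __init__(self, n_particles=8, low=-1.0, high=1.0, seed=0):
        self.rng = np.random.default_rng(seed)
        # Assumption: particles start evenly spaced over the action range,
        # and each particle is a Gaussian component of the mixture.
        self.mu = np.linspace(low, high, n_particles)
        self.sigma = np.full(n_particles, (high - low) / n_particles)

    def sample(self, logits):
        """Sample from the mixture sum_k w_k * N(mu_k, sigma_k^2)."""
        w = np.exp(log_softmax(logits))
        k = self.rng.choice(len(w), p=w)          # pick a particle
        action = self.rng.normal(self.mu[k], self.sigma[k])
        return action, k, w

    def log_prob(self, action, logits):
        """Log-density of `action` under the mixture."""
        log_w = log_softmax(logits)
        log_comp = (-0.5 * ((action - self.mu) / self.sigma) ** 2
                    - np.log(self.sigma) - 0.5 * np.log(2.0 * np.pi))
        x = log_w + log_comp
        m = x.max()
        return m + np.log(np.exp(x - m).sum())    # log-sum-exp

    def resample(self, mean_weights, threshold=1e-2, jitter=1e-3):
        """Relocate low-weight particles next to high-weight ones.

        `mean_weights` is assumed to be each particle's average mixture
        weight over recent states. Returns the indices that were moved.
        """
        mean_weights = np.asarray(mean_weights, dtype=float)
        dead = np.flatnonzero(mean_weights < threshold)
        alive = np.flatnonzero(mean_weights >= threshold)
        if len(dead) == 0 or len(alive) == 0:
            return dead
        p = mean_weights[alive] / mean_weights[alive].sum()
        targets = self.rng.choice(alive, size=len(dead), p=p)
        # Small noise keeps relocated particles distinct from their targets.
        self.mu[dead] = self.mu[targets] + self.rng.normal(0.0, jitter, len(dead))
        self.sigma[dead] = self.sigma[targets]
        return dead


policy = ParticlePolicy1D(n_particles=4, seed=0)
action, k, weights = policy.sample([0.0, 0.0, 5.0, 0.0])
moved = policy.resample([0.001, 0.5, 0.499, 0.0])
```

Because the mixture can put mass on several separated particles at once, it can represent multimodal action preferences that a single Gaussian cannot. In a full controller one such mixture would presumably be kept per action dimension, with `log_prob` feeding the usual policy-gradient objective.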
