Neuro-dynamic programming based on self-organized patterns

Proceedings of the 1999 IEEE International Symposium on Intelligent Control Intelligent Systems and Semiotics (Cat. No.99CH37014) Pub Date : 1900-01-01 DOI:10.1109/ISIC.1999.796641

J. Si, Y.-T. Wang

Citations: 1

Abstract

This paper introduces a real-time learning control mechanism as a robust and efficient neuro-dynamic programming scheme. The learning controller aims to optimize a given performance measure by learning, through interaction with the environment, to generate appropriate control actions. It starts with no prior knowledge of the system and is designed to improve its performance over time. No complete model of the system's behavior is available; instead, the designer has access to real-time sampled measurements. The state measurements are first analyzed for similarity and organized by proximity. Control actions are then generated according to the resulting state patterns. A critic network 'monitors' the performance of the controller so that it achieves a given optimality criterion. We provide a detailed implementation and performance evaluation of this learning controller on a cart-pole balancing problem.
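
The abstract gives no algorithmic detail, so the following is only a rough Python sketch of the general idea, not the authors' method. It assumes a nearest-prototype quantizer stands in for the self-organized state patterns, a per-pattern table stands in for the critic network, and a simple TD(0) error drives both critic and actor. The class name `PatternController` and the parameters `radius`, `lr`, and `gamma` are hypothetical, introduced purely for illustration.

```python
import math
import random


class PatternController:
    """Toy actor-critic over self-organized state prototypes (illustrative only)."""

    def __init__(self, radius=0.5, lr=0.1, gamma=0.95, seed=0):
        self.radius = radius   # hypothetical: max distance for joining a pattern
        self.lr = lr           # hypothetical: shared learning rate
        self.gamma = gamma     # hypothetical: discount factor
        self.rng = random.Random(seed)
        self.prototypes = []   # one centre per state pattern
        self.values = []       # critic: estimated return per pattern
        self.prefs = []        # actor: preference for action +1 per pattern

    def pattern_of(self, state):
        """Return the nearest pattern index, creating a pattern if none is close."""
        best, best_d = None, float("inf")
        for i, proto in enumerate(self.prototypes):
            d = math.dist(proto, state)
            if d < best_d:
                best, best_d = i, d
        if best is None or best_d > self.radius:
            self.prototypes.append(list(state))
            self.values.append(0.0)
            self.prefs.append(0.0)
            return len(self.prototypes) - 1
        # Winner-take-all update: pull the winning prototype towards the sample.
        proto = self.prototypes[best]
        for k in range(len(proto)):
            proto[k] += self.lr * (state[k] - proto[k])
        return best

    def act(self, k):
        """Sample a two-valued action (+1 or -1) from the pattern's preference."""
        p_plus = 1.0 / (1.0 + math.exp(-self.prefs[k]))
        return 1 if self.rng.random() < p_plus else -1

    def learn(self, k, action, reward, k_next, done):
        """TD(0) critic update; the same TD error nudges the actor preference."""
        target = reward + (0.0 if done else self.gamma * self.values[k_next])
        td_error = target - self.values[k]
        self.values[k] += self.lr * td_error
        # Heuristic actor rule (an assumption): reinforce actions with positive TD error.
        self.prefs[k] += self.lr * td_error * action
        return td_error


ctrl = PatternController(radius=0.5)
a = ctrl.pattern_of((0.0, 0.0))   # first sample founds a pattern
b = ctrl.pattern_of((0.1, 0.0))   # nearby sample joins the same pattern
c = ctrl.pattern_of((2.0, 2.0))   # distant sample founds a second pattern
td = ctrl.learn(a, 1, -1.0, c, done=True)  # a failure signal lowers value and preference
```

In a cart-pole setting, `state` would be the sampled measurement vector and the two-valued action a push to the left or right; the paper's actual networks, reinforcement signal, and update rules may differ substantially from this sketch.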


