{"title":"Discrete-time optimal control scheme based on Q-learning algorithm","authors":"Qinglai Wei, Derong Liu, Ruizhuo Song","doi":"10.1109/ICICIP.2016.7885888","DOIUrl":null,"url":null,"abstract":"This paper is concerned with optimal control problems of discrete-time nonlinear systems via a novel Q-learning algorithm. In the newly developed Q-learning algorithm, the iterative Q function in each iteration is required to update on the whole state and control spaces, instead of being updated by a single state and control pair. A new convergence criterion of the corresponding Q-learning algorithm is presented, where the traditional constraints for the learning rates of Q-learning algorithms is relaxed. Finally, simulation results are provided to exemplify the good performance of the developed algorithm.","PeriodicalId":226381,"journal":{"name":"2016 Seventh International Conference on Intelligent Control and Information Processing (ICICIP)","volume":"41 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2016 Seventh International Conference on Intelligent Control and Information Processing (ICICIP)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICICIP.2016.7885888","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1
Abstract
This paper is concerned with optimal control problems of discrete-time nonlinear systems via a novel Q-learning algorithm. In the newly developed Q-learning algorithm, the iterative Q function in each iteration is updated over the whole state and control spaces, instead of by a single state-control pair. A new convergence criterion for the corresponding Q-learning algorithm is presented, in which the traditional constraints on the learning rates of Q-learning algorithms are relaxed. Finally, simulation results are provided to exemplify the good performance of the developed algorithm.
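The abstract does not spell out the update rule, but the idea of sweeping the entire (discretized) state and control spaces per iteration, rather than sampling one state-control pair, can be sketched roughly as follows. Everything here is an assumption for illustration: the toy linear system, quadratic cost, grid sizes, and helper names are hypothetical and not taken from the paper.

```python
# Hypothetical sketch of a full-space Q iteration.
# Assumed toy system: x_{k+1} = 0.9*x_k + u_k, stage cost x^2 + u^2.
import numpy as np

states = np.linspace(-1.0, 1.0, 21)    # discretized state space (assumed grid)
controls = np.linspace(-0.5, 0.5, 11)  # discretized control space (assumed grid)
Q = np.zeros((len(states), len(controls)))

def nearest(x):
    # index of the grid state closest to x (simple projection back onto the grid)
    return int(np.argmin(np.abs(states - x)))

for _ in range(200):
    Q_new = np.empty_like(Q)
    # unlike classical Q-learning, every (x, u) pair is updated each iteration
    for i, x in enumerate(states):
        for j, u in enumerate(controls):
            x_next = np.clip(0.9 * x + u, states[0], states[-1])
            # Bellman-style update: stage cost plus the best Q value at the successor state
            Q_new[i, j] = x**2 + u**2 + Q[nearest(x_next)].min()
    Q = Q_new
```

After convergence, a greedy control law can be read off as `controls[np.argmin(Q[i])]` for each state index `i`; the paper's actual algorithm and convergence analysis concern learning-rate conditions not modeled in this sketch.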