Optimistic planning with long sequences of identical actions for near-optimal nonlinear control

2014 IEEE International Conference on Automation, Quality and Testing, Robotics. Pub Date: 2014-05-22. DOI: 10.1109/AQTR.2014.6857826

Koppány Máthé, L. Buşoniu, L. Miclea



Abstract

Optimistic planning for deterministic systems (OPD) is an algorithm that finds near-optimal control for very general nonlinear systems. OPD iteratively builds near-optimal sequences of actions by always refining the most promising sequence, which it does by appending every possible one-step action. However, OPD has large computational costs, which may be undesirable in real-life applications. This paper proposes an adaptation of OPD for a specific subclass of control problems in which control actions do not change often (e.g. bang-bang, time-optimal control). The new algorithm is called Optimistic Planning with K identical actions (OKP); it refines sequences by appending not only one-step actions but also repetitions of each action up to K times. Our analysis proves that the a posteriori performance guarantees are similar to those of OPD and improve with the length of the explored sequences, although the asymptotic behaviour of OKP cannot be formally predicted a priori. Simulations illustrate that, for a properly chosen parameter K, OKP outperforms OPD in a control problem from the class considered.
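The refinement rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not give the details, so the following are assumptions taken from the usual optimistic planning setting: rewards lie in [0, 1], returns are discounted by a factor `gamma`, and a sequence of depth d with partial return v is ranked by the upper bound v + gamma^d / (1 - gamma). The names `optimistic_plan`, `step`, and `budget` are hypothetical.

```python
import heapq
from itertools import count


def optimistic_plan(x0, actions, step, gamma=0.9, K=1, budget=50):
    """Illustrative sketch of optimistic planning with repeated actions.

    step(x, u) -> (next_state, reward), with reward assumed to be in [0, 1].
    K = 1 appends only one-step actions (OPD-style expansion);
    K > 1 also appends each action repeated 2..K times (OKP-style expansion).
    Returns (best_sequence, its discounted partial return).
    """
    tie = count()  # unique tiebreaker so states are never compared
    # Heap entries: (-upper_bound, tiebreak, partial_return, depth, state, seq)
    heap = [(-1.0 / (1.0 - gamma), next(tie), 0.0, 0, x0, ())]
    best_v, best_seq = 0.0, ()
    for _ in range(budget):  # budget = number of node expansions (assumed)
        # Pop the most promising sequence: the largest upper bound.
        _, _, v, d, x, seq = heapq.heappop(heap)
        for u in actions:
            xk, vk, dk = x, v, d
            for k in range(1, K + 1):
                # Apply the same action once more and accumulate the reward.
                xk, r = step(xk, u)
                vk += gamma ** dk * r
                dk += 1
                child = seq + (u,) * k
                bound = vk + gamma ** dk / (1.0 - gamma)
                heapq.heappush(heap, (-bound, next(tie), vk, dk, xk, child))
                if vk > best_v:
                    best_v, best_seq = vk, child
    return best_seq, best_v
```

With `K=1` every expansion adds one child per action, while a larger `K` lets a single expansion reach K steps deeper along constant-action branches, which is the intuition behind using it for bang-bang-like problems. This simplified version may insert the same sequence more than once when `K > 1` (for example, an action repeated twice can also be reached by expanding the single action), and it makes no claim about how the paper handles that case.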
