{"title":"Reinforcement learning algorithms for solving classification problems","authors":"M. Wiering, H. V. Hasselt, Auke-Dirk Pietersma, Lambert Schomaker","doi":"10.1109/ADPRL.2011.5967372","DOIUrl":"https://doi.org/10.1109/ADPRL.2011.5967372","url":null,"abstract":"We describe a new framework for applying reinforcement learning (RL) algorithms to solve classification tasks by letting an agent act on the inputs and learn value functions. This paper describes how classification problems can be modeled using classification Markov decision processes and introduces the Max-Min ACLA algorithm, an extension of the novel RL algorithm called actor-critic learning automaton (ACLA). Experiments are performed using 8 datasets from the UCI repository, where our RL method is combined with multi-layer perceptrons that serve as function approximators. The RL method is compared to conventional multi-layer perceptrons and support vector machines and the results show that our method slightly outperforms the multi-layer perceptron and performs equally well as the support vector machine. Finally, many possible extensions are described to our basic method, so that much future research can be done to make the proposed method even better.","PeriodicalId":406195,"journal":{"name":"2011 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130840259","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Global optimal strategies of a class of finite-horizon continuous-time nonaffine nonlinear zero-sum game using a new iteration algorithm","authors":"Xin Zhang, Huaguang Zhang, Lili Cui, Yanhong Luo","doi":"10.1109/ADPRL.2011.5967360","DOIUrl":"https://doi.org/10.1109/ADPRL.2011.5967360","url":null,"abstract":"In this paper we ami to solve the global optimal strategies of a class of finite-horizon continuous-time nonaffine nonlinear zero-sum game. The idea is to use a iterative algorithm to obtain the saddle point. The iterative algorithm is between two sequences which are a sequence of linear quadratic zero-sum game and a sequence of Riccati differential equation. The necessary conditions of global optimal strategies are established. A simulation example is given to illustrate the perfoermance of the proposed approach.","PeriodicalId":406195,"journal":{"name":"2011 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)","volume":"66 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127159721","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Tree-based variable selection for dimensionality reduction of large-scale control systems","authors":"A. Castelletti, S. Galelli, Marcello Restelli, R. Soncini-Sessa","doi":"10.1109/ADPRL.2011.5967387","DOIUrl":"https://doi.org/10.1109/ADPRL.2011.5967387","url":null,"abstract":"This paper is about dimensionality reduction by variable selection in high-dimensional real-world control problems, where designing controllers by conventional means is either impractical or results in poor performance.","PeriodicalId":406195,"journal":{"name":"2011 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114389785","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Higher order Q-Learning","authors":"Ashley D. Edwards, W. Pottenger","doi":"10.1109/ADPRL.2011.5967385","DOIUrl":"https://doi.org/10.1109/ADPRL.2011.5967385","url":null,"abstract":"Higher order learning is a statistical relational learning framework in which relationships between different instances of the same class are leveraged (Ganiz, Lytkin and Pottenger, 2009). Learning can be supervised or unsupervised. In contrast, reinforcement learning (Q-Learning) is a technique for learning in an unknown state space. Action selection is often based on a greedy, or epsilon greedy approach. The problem with this approach is that there is often a large amount of initial exploration before convergence. In this article we introduce a novel approach to this problem that treats a state space as a collection of data from which latent information can be extrapolated. From this data, we classify actions as leading to a high reward or low reward, and formulate behaviors based on this information. We provide experimental evidence that this technique drastically reduces the amount of exploration required in the initial stages of learning. We evaluate our algorithm in a well-known reinforcement learning domain, grid-world.","PeriodicalId":406195,"journal":{"name":"2011 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)","volume":"84 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116072186","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Active exploration by searching for experiments that falsify the computed control policy","authors":"R. Fonteneau, S. Murphy, L. Wehenkel, D. Ernst","doi":"10.1109/ADPRL.2011.5967364","DOIUrl":"https://doi.org/10.1109/ADPRL.2011.5967364","url":null,"abstract":"We propose a strategy for experiment selection - in the context of reinforcement learning - based on the idea that the most interesting experiments to carry out at some stage are those that are the most liable to falsify the current hypothesis about the optimal control policy. We cast this idea in a context where a policy learning algorithm and a model identification method are given a priori. Experiments are selected if, using the learnt environment model, they are predicted to yield a revision of the learnt control policy. Algorithms and simulation results are provided for a deterministic system with discrete action space. They show that the proposed approach is promising.","PeriodicalId":406195,"journal":{"name":"2011 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124790807","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Online adaptive learning of optimal control solutions using integral reinforcement learning","authors":"K. Vamvoudakis, D. Vrabie, F. Lewis","doi":"10.1109/ADPRL.2011.5967359","DOIUrl":"https://doi.org/10.1109/ADPRL.2011.5967359","url":null,"abstract":"In this paper we introduce an online algorithm that uses integral reinforcement knowledge for learning the continuous-time optimal control solution for nonlinear systems with infinite horizon costs and partial knowledge of the system dynamics. This algorithm is a data based approach to the solution of the Hamilton-Jacobi-Bellman equation and it does not require explicit knowledge on the system's drift dynamics. The adaptive algorithm is based on policy iteration, and it is implemented on an actor/critic structure. Both actor and critic neural networks are adapted simultaneously a persistence of excitation condition is required to guarantee convergence of the critic to the actual optimal value function. Novel tuning algorithms are given for both critic and actor networks, with extra terms in the actor tuning law being required to guarantee closed-loop dynamical stability. The convergence to the optimal controller is proven, and stability of the system is also guaranteed. Simulation examples support the theoretical result.","PeriodicalId":406195,"journal":{"name":"2011 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129789256","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Application of reinforcement learning-based algorithms in CO2 allowance and electricity markets","authors":"V. Nanduri","doi":"10.1109/ADPRL.2011.5967367","DOIUrl":"https://doi.org/10.1109/ADPRL.2011.5967367","url":null,"abstract":"Climate change is one of the most important challenges faced by the world this century. In the U.S., the electric power industry is the largest emitter of CO2, contributing to the climate crisis. Federal emissions control bills in the form of cap-and-trade programs are currently idling in the U.S. Congress. In the mean time, ten states in the northeastern U.S. have adopted a regional cap-and-trade program to reduce CO2 levels and also to increase investments in cleaner technologies. Many of the states in which the cap-and-trade programs are active operate under a restructured market paradigm, where generators compete to supply power. This research presents a bi-level game-theoretic model to capture competition between generators in cap-and-trade markets and restructured electricity markets. The solution to the game-theoretic model is obtained using a reinforcement learning based algorithm.","PeriodicalId":406195,"journal":{"name":"2011 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128236428","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Model-building semi-Markov adaptive critics","authors":"A. Gosavi, S. Murray, Jiaqiao Hu","doi":"10.1109/ADPRL.2011.5967374","DOIUrl":"https://doi.org/10.1109/ADPRL.2011.5967374","url":null,"abstract":"Adaptive or actor critics are a class of reinforcement learning (RL) or approximate dynamic programming (ADP) algorithms in which one searches over stochastic policies in order to determine the optimal deterministic policy. Classically, these algorithms have been studied for Markov decision processes (MDPs) in the context of model-free updates in which transition probabilities are avoided altogether. A model-free version for the semi-MDP (SMDP) for discounted reward in which the transition time of each transition can be a random variable was proposed in Gosavi [1]. In this paper, we propose a variant in which the transition probability model is built simultaneously with the value function and action-probability functions. While our new algorithm does not require the transition probabilities apriori, it generates them along with the estimation of the value function and the action-probability functions required in adaptive critics. Model-building and model-based versions of algorithms have numerous advantages in contrast to their model-free counterparts. In particular, they are more stable and may require less training. However the additional steps of building the model may require increased storage in the computer's memory. In addition to enumerating potential application areas for our algorithm, we will analyze the advantages and disadvantages of model building.","PeriodicalId":406195,"journal":{"name":"2011 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)","volume":"119 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134134197","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Feedback controller parameterizations for Reinforcement Learning","authors":"John W. Roberts, I. Manchester, Russ Tedrake","doi":"10.1109/ADPRL.2011.5967370","DOIUrl":"https://doi.org/10.1109/ADPRL.2011.5967370","url":null,"abstract":"Reinforcement Learning offers a very general framework for learning controllers, but its effectiveness is closely tied to the controller parameterization used. Especially when learning feedback controllers for weakly stable systems, ineffective parameterizations can result in unstable controllers and poor performance both in terms of learning convergence and in the cost of the resulting policy. In this paper we explore four linear controller parameterizations in the context of REINFORCE, applying them to the control of a reaching task with a linearized flexible manipulator. We find that some natural but naive parameterizations perform very poorly, while the Youla Parameterization (a popular parameterization from the controls literature) offers a number of robustness and performance advantages.","PeriodicalId":406195,"journal":{"name":"2011 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125012485","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Information space receding horizon control","authors":"S. Chakravorty, R. Erwin","doi":"10.1109/ADPRL.2011.5967362","DOIUrl":"https://doi.org/10.1109/ADPRL.2011.5967362","url":null,"abstract":"In this paper, we present a receding horizon solution to the problem of optimal sensor scheduling problem. The optimal sensor scheduling problem can be posed as a Partially Observed Markov Decision Process (POMDP) whose solution is given by an Information Space (I-space) Dynamic Programming (DP) problem. We present a simulation based stochastic optimization technique that, combined with a receding horizon approach, obviates the need to solve the computationally intractable I-space DP problem. The technique is tested on a simple sensor scheduling problem where a sensor has to choose among the measurements of N dynamical systems such that the information regarding the aggregate system is maximized over an infinite horizon.","PeriodicalId":406195,"journal":{"name":"2011 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126365196","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}