Improved stability criteria of ADP control for efficient context-aware decision support systems
Authors: Yury Sokolov, R. Kozma
DOI: 10.1109/ICAWST.2013.6765406 (https://doi.org/10.1109/ICAWST.2013.6765406)
Journal: 炎黄地理, vol. 2 1, pp. 41-47
Published: 2013-11-01
Citations: 3
Abstract
This paper addresses the stability of approximate dynamic programming (ADP) in sequential decision-making problems, including intelligent control. We employ an ADP control algorithm that iteratively improves the autonomous system's internal model of the external world through continuous interaction with the environment. Through this incremental learning process, the system becomes aware of the consequences of its actions in the world. We extend previous stability results for ADP control to the case of general multi-layer neural network approximators. We demonstrate the benefit of our results in the control of various systems, including the cart-pole balancing problem. Our results show significantly improved learning and control performance compared to the state of the art.
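For readers unfamiliar with the cart-pole benchmark mentioned above, the following is a minimal sketch of the plant an ADP controller would interact with. It uses the classic textbook equations of motion with Euler integration. All constants, the function names `cart_pole_step` and `utility`, and the failure thresholds are illustrative assumptions, not values or code taken from the paper.

```python
import math

# Assumed parameters (common benchmark values, not taken from the paper).
GRAVITY = 9.8            # m/s^2
CART_MASS = 1.0          # kg
POLE_MASS = 0.1          # kg
POLE_HALF_LENGTH = 0.5   # m
DT = 0.02                # s, Euler integration step


def cart_pole_step(state, force):
    """Advance the cart-pole state (x, x_dot, theta, theta_dot) by one Euler step."""
    x, x_dot, theta, theta_dot = state
    total_mass = CART_MASS + POLE_MASS
    sin_t, cos_t = math.sin(theta), math.cos(theta)

    # Frictionless cart-pole dynamics; theta is measured from the upright position.
    temp = (force + POLE_MASS * POLE_HALF_LENGTH * theta_dot ** 2 * sin_t) / total_mass
    theta_acc = (GRAVITY * sin_t - cos_t * temp) / (
        POLE_HALF_LENGTH * (4.0 / 3.0 - POLE_MASS * cos_t ** 2 / total_mass)
    )
    x_acc = temp - POLE_MASS * POLE_HALF_LENGTH * theta_acc * cos_t / total_mass

    return (
        x + DT * x_dot,
        x_dot + DT * x_acc,
        theta + DT * theta_dot,
        theta_dot + DT * theta_acc,
    )


def utility(state, max_angle=math.radians(12), max_position=2.4):
    """Hypothetical reinforcement signal: 0 while balanced, -1 on failure."""
    x, _, theta, _ = state
    return 0.0 if abs(x) <= max_position and abs(theta) <= max_angle else -1.0
```

In an ADP setting, a critic network would typically be trained to approximate the discounted sum of such utility signals, and an action network would choose `force` to improve the critic's estimate. The paper's stability analysis concerns that learning loop, which this sketch does not implement.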