Approximate dynamic programming for continuous state and control problems
J. Si, Lei Yang, Chao Lu, Jian Sun, S. Mei
2009 17th Mediterranean Conference on Control and Automation, 2009-06-24
DOI: 10.1109/MED.2009.5164745 (https://doi.org/10.1109/MED.2009.5164745)
Dynamic programming (DP) computes an optimal control policy over time for nonlinear and uncertain systems by applying the principle of optimality introduced by Richard Bellman. Instead of enumerating all possible control sequences, DP searches only the admissible state and/or action values that satisfy the principle of optimality, so its computational complexity is much lower than that of direct enumeration. However, the computational effort and data storage requirements grow exponentially with the dimensionality of the system, as reflected in the three curses of dimensionality: the state space, the observation space, and the action space. As a result, traditional DP has been limited to small problems. This paper provides an overview of recent developments in a class of approximate/adaptive dynamic programming algorithms, including those applicable to problems with continuous states and continuous controls. It reviews in particular direct heuristic dynamic programming (direct HDP), its design, and its applications, which include large and complex continuous state and control problems. In addition to the basic principle of direct HDP, the paper presents two application studies: a nonlinear tracking problem, and a power grid coordination control problem based on the China southern network.
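The contrast between direct enumeration and the principle of optimality can be illustrated on a toy problem. The sketch below is not taken from the paper and is not direct HDP: the five-state grid, the three control moves, the quadratic costs, and the names `step`, `stage_cost`, `terminal_cost`, `brute_force`, and `backward_dp` are all assumptions chosen for illustration. It only shows that backward induction on the Bellman recursion gives the same optimal cost as trying every control sequence, while doing work that grows linearly with the horizon rather than exponentially.

```python
from itertools import product

# All values below are hypothetical and chosen only for illustration.
STATES = range(5)        # discrete state grid 0..4
ACTIONS = (-1, 0, 1)     # admissible control moves
HORIZON = 6              # number of decision stages
TARGET = 3               # state the controller should reach


def step(x, u):
    """Deterministic dynamics: move by u, clipped to the state grid."""
    return min(max(x + u, 0), len(STATES) - 1)


def stage_cost(x, u):
    """Quadratic penalty on distance from the target plus control effort."""
    return (x - TARGET) ** 2 + u ** 2


def terminal_cost(x):
    """Penalty on the final state."""
    return (x - TARGET) ** 2


def brute_force(x0):
    """Direct enumeration: evaluate all len(ACTIONS) ** HORIZON sequences."""
    best = float("inf")
    for seq in product(ACTIONS, repeat=HORIZON):
        x, total = x0, 0
        for u in seq:
            total += stage_cost(x, u)
            x = step(x, u)
        best = min(best, total + terminal_cost(x))
    return best


def backward_dp():
    """Backward induction: J_k(x) = min_u [cost(x, u) + J_{k+1}(step(x, u))]."""
    J = {x: terminal_cost(x) for x in STATES}
    for _ in range(HORIZON):
        J = {
            x: min(stage_cost(x, u) + J[step(x, u)] for u in ACTIONS)
            for x in STATES
        }
    return J


J0 = backward_dp()
# Both methods must agree on the optimal cost from every initial state.
for x0 in STATES:
    assert J0[x0] == brute_force(x0)
print(J0)
```

With these settings, enumeration rolls out 3^6 = 729 control sequences per initial state, whereas backward induction evaluates 6 × 5 × 3 = 90 state-action pairs in total. The table `J`, however, has one entry per grid state, so its size grows exponentially with the state dimension. That growth is what motivates replacing the table with a function approximator in approximate dynamic programming.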