Improving Speed and Efficiency of Dynamic Programming Methods through Chaos

Journal of Artificial Intelligence and Data Mining. Pub Date: 2021-09-01. DOI: 10.22044/JADM.2021.10520.2191

H. Khodadadi, V. Derhami


Citations: 1

Abstract


Improving Speed and Efficiency of Dynamic Programming Methods through Chaos

A prominent weakness of dynamic programming methods is that every update phase operates on the entire state set of the Markov decision process. This paper proposes a novel chaos-based method to address this problem. A chaotic system is first initialized, and the numbers it generates are mapped onto the environment states through an initial processing step. In each pass of the policy iteration method, policy evaluation is performed only once, and only a few states, those proposed by the chaotic system, are updated. The policy evaluation and improvement cycle continues until an optimal policy is obtained for the environment. The same procedure is applied to the value iteration method: in each pass, only the values of the few states proposed by the chaotic system are updated, while the values of the other states are left unchanged. Unlike conventional methods, the proposed method can obtain an optimal solution by updating only a limited number of states, which the chaotic system distributes well over the whole environment. Test results indicate that the chaotic dynamic programming methods reach the optimal solution with improved speed and efficiency in different grid environments.
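The value iteration variant described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not name the chaotic system, the mapping onto states, or the stopping rule, so the following are all assumptions made for illustration: the logistic map with r = 4 as the chaotic source, a simple scale-and-truncate mapping onto state indices, a deterministic grid with a reward of -1 per move, and a full Bellman-residual check after each pass. The names `logistic_map`, `make_grid`, `backup`, and `chaotic_value_iteration` are hypothetical.

```python
def logistic_map(x0=0.3141, r=4.0):
    """Yield an endless sequence from the logistic map (assumed chaotic source)."""
    x = x0
    while True:
        x = r * x * (1.0 - x)
        yield x


def make_grid(n):
    """Deterministic n x n grid; the goal is the last cell (assumed environment)."""
    goal = n * n - 1
    moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

    def step(state, action):
        row, col = divmod(state, n)
        d_row, d_col = moves[action]
        # Moving into a wall leaves the agent where it is.
        row = min(max(row + d_row, 0), n - 1)
        col = min(max(col + d_col, 0), n - 1)
        return row * n + col

    return goal, step


def backup(values, state, goal, step, gamma):
    """One Bellman optimality backup for a single state (reward -1 per move)."""
    if state == goal:
        return 0.0
    return max(-1.0 + gamma * values[step(state, a)] for a in range(4))


def chaotic_value_iteration(n=4, k=4, gamma=1.0, tol=1e-9, max_passes=10000):
    """Update only k chaos-selected states per pass; other values stay unchanged."""
    goal, step = make_grid(n)
    n_states = n * n
    values = [0.0] * n_states
    chaos = logistic_map()
    for passes in range(1, max_passes + 1):
        for _ in range(k):
            # Assumed mapping: scale the chaotic number to a state index.
            state = min(int(next(chaos) * n_states), n_states - 1)
            values[state] = backup(values, state, goal, step, gamma)
        # Assumed stopping rule. This check visits every state, so a real
        # implementation would want a cheaper test or a less frequent one.
        residual = max(
            abs(backup(values, s, goal, step, gamma) - values[s])
            for s in range(n_states)
        )
        if residual < tol:
            return values, passes
    raise RuntimeError("no convergence within max_passes")


if __name__ == "__main__":
    values, passes = chaotic_value_iteration()
    print("passes:", passes)
    for row in range(4):
        print(values[row * 4:(row + 1) * 4])
```

With these assumptions, each converged value equals minus the number of moves to the goal. The sketch only shows that partial, chaos-selected updates can still reach the optimal values; it says nothing about the speed-ups reported in the paper, which depend on details the abstract does not give.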


Source journal

Journal of Artificial Intelligence and Data Mining

Self-citation rate

0.00%

Articles published

Review time

8 weeks