Structure and Randomness in Planning and Reinforcement Learning
K. Czechowski, Piotr Januszewski, Piotr Kozakowski, Łukasz Kuciński, Piotr Milos
2021 International Joint Conference on Neural Networks (IJCNN), published 2021-07-18. DOI: 10.1109/IJCNN52387.2021.9533317
Abstract
Planning in large state spaces inevitably requires balancing the depth and breadth of the search. This balance has a crucial impact on a planner's performance, yet most planners manage it only implicitly. We present a novel method, Shoot Tree Search (STS), which makes it possible to control this trade-off more explicitly. Our algorithm can be understood as an interpolation between two celebrated search mechanisms: Monte Carlo Tree Search (MCTS) and random shooting. It also lets the user control the bias-variance trade-off, akin to TD(n), but in the tree search context. In experiments on challenging domains, we show that STS can get the best of both worlds, consistently achieving higher scores.
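The abstract compares STS's bias-variance control to TD(n). As a rough illustration of that analogy only, the sketch below computes the standard n-step return: the discounted sum of the first n rewards plus a discounted value estimate at the state reached after n steps. It is not the STS algorithm. The function name `n_step_return` and its parameters are hypothetical, chosen for this example. A small n leans on the value estimate (more bias, less variance); a large n leans on sampled rewards (less bias, more variance).

```python
from typing import Sequence


def n_step_return(
    rewards: Sequence[float],
    values: Sequence[float],
    n: int,
    gamma: float = 0.99,
) -> float:
    """Standard n-step return from the start of a trajectory.

    rewards[k] is the reward received at step k.
    values[k] is a value estimate for the state at step k, so
    len(values) must equal len(rewards) + 1 (hypothetical convention).
    """
    if len(values) != len(rewards) + 1:
        raise ValueError("values must have exactly one more entry than rewards")
    if not 0 <= n <= len(rewards):
        raise ValueError("n must be between 0 and len(rewards)")

    # Sampled part: discounted sum of the first n observed rewards.
    sampled = sum(gamma**k * rewards[k] for k in range(n))
    # Bootstrapped part: discounted value estimate after n steps.
    return sampled + gamma**n * values[n]


if __name__ == "__main__":
    rewards = [1.0, 1.0, 1.0]
    values = [5.0, 4.0, 3.0, 0.0]
    for n in range(len(rewards) + 1):
        print(n, n_step_return(rewards, values, n, gamma=0.5))
```

With n = 0 the result is just the value estimate at the starting state; with n equal to the trajectory length it is the full sampled return plus whatever estimate is given for the final state.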