Tuning the Hyperparameters of Anytime Planning: A Metareasoning Approach with Deep Reinforcement Learning

International Conference on Automated Planning and Scheduling. Pub Date: 2022-06-13. DOI: 10.1609/icaps.v32i1.19842

Abhinav Bhatia, Justin Svegliato, Samer B. Nashed, S. Zilberstein

Citations: 4

Abstract

Anytime planning algorithms often have hyperparameters that can be tuned at runtime to optimize their performance. While work on metareasoning has focused on when to interrupt an anytime planner and act on the current plan, the scope of metareasoning can be expanded to tuning the hyperparameters of the anytime planner at runtime. This paper introduces a general, decision-theoretic metareasoning approach that optimizes both the stopping point and hyperparameters of anytime planning. We begin by proposing a generalization of the standard meta-level control problem for anytime algorithms. We then offer a meta-level control technique that monitors and controls an anytime algorithm using deep reinforcement learning. Finally, we show that our approach boosts performance on a common benchmark domain that uses anytime weighted A* to solve a range of heuristic search problems and a mobile robot application that uses RRT* to solve motion planning problems.
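The control loop described in the abstract can be illustrated with a toy sketch. Everything here is an assumption made for illustration, not the authors' implementation: `ToyAnytimePlanner`, `time_dependent_utility`, `rule_policy`, and `run_meta_level_control` are hypothetical names, the planner's quality curve is invented, and a hand-written rule stands in for the deep reinforcement learning policy that the paper learns. At each step the meta-level controller either stops the planner or picks the hyperparameter (here a single `weight`) for the next step.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class ToyAnytimePlanner:
    """Hypothetical stand-in for an anytime planner (not anytime weighted A* or RRT*)."""
    quality: float = 0.0

    def step(self, weight: float) -> float:
        # Invented dynamics: a larger weight improves quality faster
        # but caps the quality that can be reached at 1 / weight.
        ceiling = 1.0 / weight
        rate = 0.2 * weight
        self.quality += rate * max(0.0, ceiling - self.quality)
        return self.quality


def time_dependent_utility(quality: float, t: int, time_cost: float = 0.02) -> float:
    """Assumed utility: solution quality minus a linear cost of time."""
    return quality - time_cost * t


# A policy maps (quality, last_gain, t) to a weight, or to None meaning "stop".
Policy = Callable[[float, float, int], Optional[float]]


def rule_policy(quality: float, last_gain: float, t: int) -> Optional[float]:
    """Hand-written placeholder for a learned policy."""
    if t > 0 and last_gain < 0.02:
        return None  # the last step gained less than one step costs: stop
    return 2.0 if quality < 0.4 else 1.0  # greedy early, careful later


def run_meta_level_control(
    planner: ToyAnytimePlanner,
    policy: Policy,
    max_steps: int = 100,
    time_cost: float = 0.02,
) -> Tuple[float, int, float]:
    """Monitor the planner and let the policy stop it or retune it each step."""
    t, last_gain = 0, 0.0
    while t < max_steps:
        weight = policy(planner.quality, last_gain, t)
        if weight is None:
            break
        before = planner.quality
        planner.step(weight)
        last_gain = planner.quality - before
        t += 1
    return planner.quality, t, time_dependent_utility(planner.quality, t, time_cost)


quality, steps, utility = run_meta_level_control(ToyAnytimePlanner(), rule_policy)
print(f"stopped after {steps} steps, quality={quality:.3f}, utility={utility:.3f}")
```

In this sketch the stopping decision and the hyperparameter choice come from the same policy call, which mirrors the abstract's point that both can be handled by one meta-level controller. A learned policy would replace `rule_policy` while the loop stays the same.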
