Faster MIL-based Subgoal Identification for Reinforcement Learning by Tuning Fewer Hyperparameters

IF 2.2 | Zone 4, Computer Science | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

ACM Transactions on Autonomous and Adaptive Systems Pub Date : 2024-02-05 DOI:10.1145/3643852

Saim Sunel, Erkin Çilden, Faruk Polat


Citations: 0

Abstract

Various methods have been proposed in the literature for identifying subgoals in discrete reinforcement learning (RL) tasks. Once subgoals are discovered, task decomposition methods can be employed to improve the learning performance of agents. In this study, we classify prominent subgoal identification methods for discrete RL tasks into three categories: graph-based, statistics-based, and multi-instance learning (MIL)-based. Our contributions are as follows. First, we introduce a new MIL-based subgoal identification algorithm called EMDD-RL and experimentally compare it with a previous MIL-based method. The previous approach adapts MIL's Diverse Density (DD) algorithm, whereas our method builds on Expectation-Maximization Diverse Density (EMDD). The advantage of EMDD over DD is that, thanks to the expectation-maximization algorithm, it can yield more accurate results with lower computational demand. EMDD-RL modifies some of the algorithmic steps of EMDD to identify subgoals in discrete RL problems. Second, we evaluate the methods on several RL tasks with respect to the hyperparameter tuning overhead they incur. Third, we propose a new RL problem called key-room and compare the subgoal identification performance of the methods on this new task. Experimental results show that MIL-based subgoal identification methods could be preferred in practice to the algorithms of the other two categories.
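To make the Diverse Density idea mentioned in the abstract more concrete, here is a minimal sketch of the commonly used noisy-or DD score applied to discrete states. It is not the authors' EMDD-RL algorithm, and it omits the expectation-maximization step entirely. The assumptions are all illustrative: bags are trajectories of grid coordinates, positive bags are successful episodes, negative bags are failed ones, and the names `instance_prob`, `diverse_density`, `best_subgoal`, and the `scale` parameter are hypothetical.

```python
import math


def instance_prob(state, concept, scale=1.0):
    """Probability that a state matches the candidate concept.

    Assumption: a Gaussian-like kernel on squared Euclidean distance.
    """
    dist_sq = sum((a - b) ** 2 for a, b in zip(state, concept))
    return math.exp(-scale * dist_sq)


def diverse_density(concept, positive_bags, negative_bags, scale=1.0):
    """Noisy-or Diverse Density score of one candidate concept."""
    dd = 1.0
    # Each positive bag should contain at least one state near the concept.
    for bag in positive_bags:
        miss = 1.0
        for state in bag:
            miss *= 1.0 - instance_prob(state, concept, scale)
        dd *= 1.0 - miss
    # No state in any negative bag should be near the concept.
    for bag in negative_bags:
        for state in bag:
            dd *= 1.0 - instance_prob(state, concept, scale)
    return dd


def best_subgoal(positive_bags, negative_bags, scale=1.0):
    """Return the state from the positive bags with the highest DD score."""
    candidates = {state for bag in positive_bags for state in bag}
    return max(
        candidates,
        key=lambda c: diverse_density(c, positive_bags, negative_bags, scale),
    )


if __name__ == "__main__":
    # Hypothetical toy data: both successful trajectories pass through (2, 2).
    positive = [[(0, 0), (1, 1), (2, 2), (3, 3)],
                [(0, 4), (1, 3), (2, 2), (4, 4)]]
    negative = [[(0, 0), (0, 1), (0, 2)],
                [(4, 4), (4, 3)]]
    print(best_subgoal(positive, negative))
```

In this toy setup the score favors states that every successful trajectory visits and that failed trajectories avoid, which is the intuition behind treating high-DD states as subgoal candidates. Restricting candidates to states seen in positive bags is a simplification that suits small discrete state spaces.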


Source journal

ACM Transactions on Autonomous and Adaptive Systems (Engineering & Technology, Computer Science: Theory & Methods)

CiteScore

4.80

Self-citation rate

7.40%

Articles published

Review time

>12 weeks

Journal description: TAAS addresses research on autonomous and adaptive systems being undertaken by an increasingly interdisciplinary research community, and provides a common platform under which this work can be published and disseminated. TAAS encourages contributions aimed at supporting the understanding, development, and control of such systems and of their behaviors. Contributions are expected to be based on sound and innovative theoretical models, algorithms, engineering and programming techniques, infrastructures and systems, or technological and application experiences.