Constructivist Approach to State Space Adaptation in Reinforcement Learning

2019 IEEE 13th International Conference on Self-Adaptive and Self-Organizing Systems (SASO). Pub Date: 2019-06-16. DOI: 10.1109/SASO.2019.00016

Maxime Guériau, Nicolás Cardozo, Ivana Dusparic


Citations: 7

Abstract




Reinforcement learning (RL) is increasingly used to achieve adaptive behaviours in Internet of Things systems relying on large amounts of sensor data. To address the need for self-adaptation in such environments, techniques for detecting environment changes and re-learning behaviours appropriate to those changes have been proposed. However, with the heterogeneity of sensor inputs, the problem of self-adaptation permeates one level deeper; in order for the learnt behaviour to adapt, the underlying environment representation needs to adapt first. The granularity of the RL state space might need to be adapted to learn more efficiently, or to match the new granularity of input data. This paper proposes an implementation of Constructivist RL (Con-RL), enabling RL to learn and continuously adapt its state space representations. We propose a Multi-Layer Growing Neural Gas (ML-GNG) technique, as an extension of the GNG clustering algorithm, to autonomously learn suitable state spaces based on sensor data and learnt actions at runtime. We also create and continuously update a repository of state spaces, selecting the most appropriate one to use at each time step. We evaluate Con-RL in two scenarios: the canonical RL mountain car single-agent scenario, and a large-scale multi-agent car and ride-sharing scenario. We demonstrate its ability to adapt to new sensor inputs, to increase the speed of learning through state space optimization, and to maintain stable long-term performance.
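To make the idea of a learnt state space concrete, the following is a minimal sketch, not the authors' ML-GNG or Con-RL implementation. It assumes a fixed set of prototype vectors (such as the nodes a clustering algorithm like GNG would maintain), maps each continuous observation to the index of its nearest prototype, and uses that index as the discrete state in a tabular Q-learning update. The function names, the learning parameters `eps`, `alpha` and `gamma`, and the simplified winner-only adaptation step are all assumptions made for illustration; node insertion, edge ageing, the multi-layer structure and the state space repository described in the abstract are omitted.

```python
import math
from typing import List, Sequence


def nearest_prototype(obs: Sequence[float], prototypes: List[List[float]]) -> int:
    """Return the index of the prototype closest to obs (Euclidean distance)."""
    return min(range(len(prototypes)), key=lambda i: math.dist(obs, prototypes[i]))


def adapt_winner(obs: Sequence[float], prototypes: List[List[float]], eps: float = 0.1) -> int:
    """Move the winning prototype a fraction eps towards obs.

    This is only the winner-update step of competitive learning; a full GNG
    also adapts neighbours, ages edges and inserts new nodes (not shown).
    """
    w = nearest_prototype(obs, prototypes)
    prototypes[w] = [p + eps * (o - p) for p, o in zip(prototypes[w], obs)]
    return w


def q_update(q: List[List[float]], s: int, a: int, r: float, s_next: int,
             alpha: float = 0.5, gamma: float = 0.9) -> None:
    """Standard tabular Q-learning update on prototype-indexed states."""
    q[s][a] += alpha * (r + gamma * max(q[s_next]) - q[s][a])
```

With hypothetical two-dimensional prototypes over (position, velocity), as in the mountain car task, an agent would call `nearest_prototype` on each observation to obtain its state index and then `q_update` after each transition. Changing the number or placement of prototypes changes the granularity of the state space without changing the learning rule.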


Source journal

2019 IEEE 13th International Conference on Self-Adaptive and Self-Organizing Systems (SASO)

Self-citation rate

0.00%

Publication count