Prediction-Based Multi-Agent Reinforcement Learning in Inherently Non-Stationary Environments

Andrei Marinescu, Ivana Dusparic, S. Clarke
{"title":"Prediction-Based Multi-Agent Reinforcement Learning in Inherently Non-Stationary Environments","authors":"Andrei Marinescu, Ivana Dusparic, S. Clarke","doi":"10.1145/3070861","DOIUrl":null,"url":null,"abstract":"Multi-agent reinforcement learning (MARL) is a widely researched technique for decentralised control in complex large-scale autonomous systems. Such systems often operate in environments that are continuously evolving and where agents’ actions are non-deterministic, so called inherently non-stationary environments. When there are inconsistent results for agents acting on such an environment, learning and adapting is challenging. In this article, we propose P-MARL, an approach that integrates prediction and pattern change detection abilities into MARL and thus minimises the effect of non-stationarity in the environment. The environment is modelled as a time-series, with future estimates provided using prediction techniques. Learning is based on the predicted environment behaviour, with agents employing this knowledge to improve their performance in realtime. We illustrate P-MARL’s performance in a real-world smart grid scenario, where the environment is heavily influenced by non-stationary power demand patterns from residential consumers. We evaluate P-MARL in three different situations, where agents’ action decisions are independent, simultaneous, and sequential. 
Results show that all methods outperform traditional MARL, with sequential P-MARL achieving best results.","PeriodicalId":377078,"journal":{"name":"ACM Transactions on Autonomous and Adaptive Systems (TAAS)","volume":"33 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-05-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"51","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Autonomous and Adaptive Systems (TAAS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3070861","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 51

Abstract

Multi-agent reinforcement learning (MARL) is a widely researched technique for decentralised control in complex large-scale autonomous systems. Such systems often operate in environments that are continuously evolving and where agents’ actions are non-deterministic, so-called inherently non-stationary environments. When agents acting on such an environment obtain inconsistent results, learning and adapting are challenging. In this article, we propose P-MARL, an approach that integrates prediction and pattern-change-detection abilities into MARL and thus minimises the effect of non-stationarity in the environment. The environment is modelled as a time series, with future estimates provided using prediction techniques. Learning is based on the predicted environment behaviour, with agents employing this knowledge to improve their performance in real time. We illustrate P-MARL’s performance in a real-world smart-grid scenario, where the environment is heavily influenced by non-stationary power-demand patterns from residential consumers. We evaluate P-MARL in three different situations, where agents’ action decisions are independent, simultaneous, and sequential. Results show that all three variants outperform traditional MARL, with sequential P-MARL achieving the best results.
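The core idea in the abstract — forecast the non-stationary environment as a time series, then let an agent learn against the predicted behaviour before acting in real time — can be illustrated with a minimal sketch. This is not the paper's algorithm: the seasonal-naive forecaster, the two-state demand discretisation, the toy demand pattern, and the single tabular Q-learning agent are all simplifying assumptions chosen for illustration.

```python
import random

def seasonal_naive_forecast(history, period, horizon):
    """Forecast `horizon` steps ahead by repeating the last observed period."""
    last = history[-period:]
    return [last[t % period] for t in range(horizon)]

class QAgent:
    """Minimal tabular Q-learning agent with epsilon-greedy exploration."""
    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.9, eps=0.2):
        self.q = [[0.0] * n_actions for _ in range(n_states)]
        self.alpha, self.gamma, self.eps = alpha, gamma, eps
        self.n_actions = n_actions

    def act(self, state, greedy=False):
        if not greedy and random.random() < self.eps:
            return random.randrange(self.n_actions)
        row = self.q[state]
        return row.index(max(row))

    def update(self, s, a, r, s2):
        target = r + self.gamma * max(self.q[s2])
        self.q[s][a] += self.alpha * (target - self.q[s][a])

# Toy residential demand pattern (arbitrary units), repeating each "day".
pattern = [1, 3, 5, 2]
history = pattern * 3

# Phase 1: predict upcoming demand from the historical time series.
forecast = seasonal_naive_forecast(history, period=4, horizon=500)

# Phase 2: pre-learn a charging policy offline, on the *predicted* demand.
# States: 0 = low demand (<= 2), 1 = high demand. Actions: 0 = defer, 1 = charge.
bucket = lambda d: 0 if d <= 2 else 1
random.seed(0)
agent = QAgent(n_states=2, n_actions=2)
for t in range(len(forecast) - 1):
    s = bucket(forecast[t])
    a = agent.act(s)
    # Charging during low demand is rewarded; charging at a peak is penalised.
    r = (1.0 if forecast[t] <= 2 else -1.0) if a == 1 else 0.0
    agent.update(s, a, r, bucket(forecast[t + 1]))

# Phase 3: act greedily in real time using the pre-learned policy.
policy = [agent.act(s, greedy=True) for s in (0, 1)]
print(policy)  # the agent should charge at low demand and defer at peaks
```

In this sketch the offline phase plays the role of learning on the predicted environment; a pattern-change detector (as the abstract describes) would additionally monitor the real demand stream and trigger re-forecasting and re-learning when observations diverge from the prediction.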