Multi-Agent Reinforcement Learning in Dynamic Industrial Context

2023 IEEE 47th Annual Computers, Software, and Applications Conference (COMPSAC). Pub Date: 2023-06-01. DOI: 10.1109/COMPSAC57700.2023.00066

Hongyi Zhang, Jingya Li, Z. Qi, Anders Aronsson, Jan Bosch, H. H. Olsson



Abstract



Deep reinforcement learning has advanced significantly in recent years and is now used in embedded systems as well as in simulators and games. Reinforcement learning (RL) algorithms are increasingly used to improve device operation, letting devices learn on their own and offer clients better services, and RL has recently been studied in a variety of industrial applications. However, RL has been shown to be unstable and unable to adapt to realistic situations when deployed in real-world settings, especially when controlling a large number of agents in an industrial environment. The goal of this study is to enable multiple RL agents to learn control policies independently in dynamic industrial contexts. To this end, we propose a dynamic multi-agent reinforcement learning (dynamic multi-RL) method, together with adaptive exploration (AE) and vector-based action selection (VAS) techniques that accelerate model convergence and help the agents adapt to a complex industrial environment. The proposed algorithm is validated in an emergency scenario from the telecommunications industry, in which three unmanned aerial vehicle base stations (UAV-BSs) provide temporary coverage to mission-critical (MC) customers in a disaster zone after the original serving base station (BS) has been destroyed by a natural disaster. The algorithm automatically directs the participating agents to enhance service quality. Our findings demonstrate that the proposed dynamic multi-RL algorithm can effectively manage the learning of multiple agents and adjust to dynamic industrial environments. It also increases learning speed and improves quality of service.
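The abstract does not describe how AE or VAS work internally, so the following is only a generic illustration of the broader idea: several agents learning independently, each adjusting its own exploration rate as it goes. It is not the paper's algorithm. Everything here is an assumption made for the sketch: the toy one-dimensional world, the reward (negative distance to a per-agent target cell), the `AdaptiveAgent` class, and the rule that shrinks or grows epsilon depending on whether the latest reward beats a running average. VAS is not modelled at all.

```python
import random

ACTIONS = (-1, 0, 1)  # move left, stay, move right


class AdaptiveAgent:
    """Tabular Q-learner with a heuristic, self-adjusting exploration rate.

    Hypothetical illustration only; not the AE technique from the paper.
    """

    def __init__(self, n_states, target, rng, alpha=0.1, gamma=0.9,
                 eps_min=0.05, eps_max=1.0):
        self.q = [[0.0] * len(ACTIONS) for _ in range(n_states)]
        self.target = target
        self.rng = rng
        self.alpha, self.gamma = alpha, gamma
        self.eps_min, self.eps_max = eps_min, eps_max
        self.eps = eps_max        # start fully exploratory
        self.avg_reward = 0.0     # running average of observed rewards

    def greedy(self, state):
        """Index of the highest-valued action (first one wins ties)."""
        row = self.q[state]
        return max(range(len(ACTIONS)), key=row.__getitem__)

    def act(self, state):
        """Epsilon-greedy action selection using the current epsilon."""
        if self.rng.random() < self.eps:
            return self.rng.randrange(len(ACTIONS))
        return self.greedy(state)

    def update(self, s, a, r, s_next):
        # Standard one-step Q-learning update.
        td_target = r + self.gamma * max(self.q[s_next])
        self.q[s][a] += self.alpha * (td_target - self.q[s][a])
        # Assumed adaptation rule: explore less while rewards are at or
        # above the running average, explore more when they fall below it.
        if r >= self.avg_reward:
            self.eps = max(self.eps_min, self.eps * 0.99)
        else:
            self.eps = min(self.eps_max, self.eps * 1.01)
        self.avg_reward += 0.05 * (r - self.avg_reward)


def train(n_states=5, targets=(0, 2, 3), episodes=300, steps=20, seed=0):
    """Train one independent learner per target on a toy 1-D line."""
    rng = random.Random(seed)
    agents = [AdaptiveAgent(n_states, t, rng) for t in targets]
    for _ in range(episodes):
        positions = [rng.randrange(n_states) for _ in agents]
        for _ in range(steps):
            for i, agent in enumerate(agents):
                s = positions[i]
                a = agent.act(s)
                s_next = min(n_states - 1, max(0, s + ACTIONS[a]))
                r = -abs(s_next - agent.target)  # closer to target is better
                agent.update(s, a, r, s_next)
                positions[i] = s_next
    return agents


if __name__ == "__main__":
    for agent in train():
        policy = [ACTIONS[agent.greedy(s)] for s in range(len(agent.q))]
        print(agent.target, round(agent.eps, 3), policy)
```

Each agent here keeps its own Q-table and its own epsilon, so no agent needs to observe the others. The agents in this toy world do not interact, which is a much weaker setting than the coordinated UAV-BS scenario the abstract describes.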
