Reinforcement-Learning-Based Multi-Unmanned Aerial Vehicle Optimal Control for Communication Services With Limited Endurance

IF 4.9 · Zone 3 (Computer Science) · JCR Q1 (Computer Science, Artificial Intelligence)

IEEE Transactions on Cognitive and Developmental Systems · Pub Date: 2024-08-12 · DOI: 10.1109/TCDS.2024.3441865

Lu Dong;Pinle Ding;Xin Yuan;Andi Xu;Jie Gui

Volume 17, Issue 1, pp. 219-231 · Journal Article · Not open access · Source: Semantic Scholar · https://ieeexplore.ieee.org/document/10633905/

Citations: 0

Abstract


Reinforcement-Learning-Based Multi-Unmanned Aerial Vehicle Optimal Control for Communication Services With Limited Endurance

This article investigates the service path problem of multi-unmanned aerial vehicle (multi-UAV) providing communication services to multiuser in urban environments with limited endurance. Our goal is to learn an optimal multi-UAV centralized control policy that will enable UAVs to find the illumination areas in urban environments through curiosity-driven exploration and harvest energy to continue providing communication services to users. First, we propose a reinforcement learning (RL)-based multi-UAV centralized control strategy to maximize the accumulated communication service score. In the proposed framework, curiosity can act as an internal incentive signal, allowing UAVs to explore the environment without any prior knowledge. Second, a two-phase exploring protocol is proposed for practical implementation. Compared to the baseline method, our proposed method can achieve a significantly higher accumulated communication service score in the exploitation-intensive phase. The results demonstrate that the proposed method can obtain accurate service paths over the baseline method and handle the exploration-exploitation tradeoff well.
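The abstract says curiosity acts as an internal incentive signal alongside the communication service score, and that exploration follows a two-phase protocol. It does not specify the curiosity module or the phase schedule, so the following is only a minimal sketch under stated assumptions: it swaps in a generic count-based novelty bonus (not necessarily the authors' formulation) and a hard switch between an exploration-intensive and an exploitation-intensive phase. The names `CountCuriosity`, `total_reward`, `beta`, and `explore_steps` are hypothetical.

```python
from collections import defaultdict
import math


class CountCuriosity:
    """Count-based novelty bonus (assumed stand-in for the paper's curiosity signal).

    The bonus for a state is beta / sqrt(N(s)), where N(s) is the number of
    times the state has been visited, including the current visit.
    """

    def __init__(self, beta=1.0):
        self.beta = beta
        self.counts = defaultdict(int)

    def bonus(self, state):
        # Record the visit, then return a bonus that shrinks with familiarity.
        self.counts[state] += 1
        return self.beta / math.sqrt(self.counts[state])


def total_reward(extrinsic, intrinsic, step, explore_steps):
    """Combine service score and curiosity under an assumed two-phase schedule.

    Before `explore_steps` the intrinsic bonus is added in full
    (exploration-intensive phase); afterwards it is dropped
    (exploitation-intensive phase).
    """
    weight = 1.0 if step < explore_steps else 0.0
    return extrinsic + weight * intrinsic


if __name__ == "__main__":
    curiosity = CountCuriosity(beta=1.0)
    cell = (3, 4)  # hypothetical grid cell occupied by one UAV
    for step in range(4):
        r_int = curiosity.bonus(cell)
        print(step, total_reward(extrinsic=2.0, intrinsic=r_int, step=step, explore_steps=2))
```

Revisiting the same cell makes the bonus decay, which nudges a learner toward unvisited regions such as undiscovered illumination areas; once the exploration phase ends, only the service score drives the policy.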


Source journal

IEEE Transactions on Cognitive and Developmental Systems (Computer Science: Software)

CiteScore: 7.20

Self-citation rate: 10.00%

Articles published: 170

About the journal: The IEEE Transactions on Cognitive and Developmental Systems (TCDS) focuses on advances in the study of development and cognition in natural (humans, animals) and artificial (robots, agents) systems. It welcomes contributions from multiple related disciplines including cognitive systems, cognitive robotics, developmental and epigenetic robotics, autonomous and evolutionary robotics, social structures, multi-agent and artificial life systems, computational neuroscience, and developmental psychology. Articles on theoretical, computational, application-oriented, and experimental studies as well as reviews in these areas are considered.