An Investigation on Multi-UAVs Cooperative Control Algorithm for Target Chasing

2022 International Conference on Cyber-Physical Social Intelligence (ICCSI) | Pub Date: 2022-11-18 | DOI: 10.1109/ICCSI55536.2022.9970622

Haopeng Li, Guoqing Qi, Zhifeng Jin, Yinya Li, A. Sheng


Citations: 0

Abstract


An Investigation on Multi-UAVs Cooperative Control Algorithm for Target Chasing

To address the problem of cooperative pursuit of non-cooperative UAVs by multiple unmanned aerial vehicles (multi-UAVs), a directional chase strategy based on the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) algorithm is designed using deep reinforcement learning. By designing the algorithm model, state variables, action variables, and reward function, the UAVs are trained to learn the directional chase strategy under a distributed-actor, centralized-critic structure. Simulations show that the proposed strategy learns more efficiently, while maintaining accuracy, than a strategy trained in a distributed manner with the Deep Deterministic Policy Gradient (DDPG) algorithm. It also achieves a higher capture efficiency than the conventional pursuit-only strategy, offering a new research direction for multi-UAV confrontation.
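The distributed-actor, centralized-critic structure mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the number of pursuers, the observation and action sizes, and the use of random linear maps in place of trained neural networks are all assumptions made for illustration. It follows the standard MADDPG layout, in which each agent has its own actor and its own critic; the abstract does not say whether the paper does the same. The point is only the difference in inputs: each actor sees its own observation, while each critic sees the observations and actions of all pursuers.

```python
import math
import random

# Hypothetical sizes; the abstract does not specify them.
N_PURSUERS = 3   # number of pursuing UAVs
OBS_DIM = 4      # per-UAV observation size
ACT_DIM = 2      # per-UAV action size


def make_linear(in_dim, out_dim, rng):
    """Random weight matrix (out_dim x in_dim) standing in for a trained network."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]


def linear(weights, x):
    """Multiply the weight matrix by the vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def actor(weights, own_obs):
    """Decentralized actor: uses only its own observation; tanh bounds the action."""
    return [math.tanh(v) for v in linear(weights, own_obs)]


def centralized_critic(weights, all_obs, all_actions):
    """Centralized critic: scores the joint observation and joint action."""
    joint = [v for obs in all_obs for v in obs]
    joint += [v for act in all_actions for v in act]
    return linear(weights, joint)[0]


rng = random.Random(0)

# One actor and one critic per pursuer, as in the standard MADDPG layout.
actors = [make_linear(OBS_DIM, ACT_DIM, rng) for _ in range(N_PURSUERS)]
critic_in = N_PURSUERS * (OBS_DIM + ACT_DIM)
critics = [make_linear(critic_in, 1, rng) for _ in range(N_PURSUERS)]

# Placeholder observations; a real simulator would supply these.
observations = [[rng.uniform(-1, 1) for _ in range(OBS_DIM)] for _ in range(N_PURSUERS)]

# Execution is decentralized: each UAV acts on its own observation.
actions = [actor(w, o) for w, o in zip(actors, observations)]

# Training-time evaluation is centralized: each critic sees everything.
q_values = [centralized_critic(w, observations, actions) for w in critics]
```

Because the critics are needed only during training, each UAV can run its actor alone at deployment time. In the DDPG baseline the abstract compares against, each agent's critic would presumably take only that agent's own observation and action.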
