Haopeng Li, Guoqing Qi, Zhifeng Jin, Yinya Li, A. Sheng
2022 International Conference on Cyber-Physical Social Intelligence (ICCSI), published 2022-11-18. DOI: 10.1109/ICCSI55536.2022.9970622 (https://doi.org/10.1109/ICCSI55536.2022.9970622)
An Investigation on Multi-UAVs Cooperative Control Algorithm for Target Chasing
Aiming at the problem of collective pursuit of non-cooperative UAVs by multiple unmanned aerial vehicles (multi-UAVs), a directional chase strategy based on the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) algorithm is designed using deep reinforcement learning theory. By designing the algorithm model, state variables, action variables, and reward function, the UAVs are trained to learn the directional chase strategy on a distributed-Actor, centralized-Critic structure. Simulation verifies that the proposed chase strategy learns more efficiently, while maintaining accuracy, than a strategy trained in a distributed manner with the Deep Deterministic Policy Gradient (DDPG) algorithm. It also achieves a higher capture efficiency than the conventional pursuit-only strategy, providing a new research idea for multi-UAV confrontation.
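The distributed-Actor, centralized-Critic structure named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the state variables, action variables, reward function, or network design, so the dimensions (`N_AGENTS`, `OBS_DIM`, `ACT_DIM`), the single-layer linear models, and all class and function names below are hypothetical. For brevity a single critic is shown, whereas MADDPG typically keeps one centralized critic per agent. No training step is included; the sketch only shows which inputs each component sees.

```python
import math
import random


def make_linear(in_dim, out_dim, rng):
    """Random weight matrix (out_dim x in_dim) stored as nested lists."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
            for _ in range(out_dim)]


def linear(weights, x):
    """Matrix-vector product for the nested-list weights above."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


class Actor:
    """Decentralized actor: maps only its own observation to an action."""

    def __init__(self, obs_dim, act_dim, rng):
        self.w = make_linear(obs_dim, act_dim, rng)

    def act(self, obs):
        # tanh keeps every action component inside [-1, 1]
        return [math.tanh(v) for v in linear(self.w, obs)]


class CentralCritic:
    """Centralized critic: scores the joint observations and actions."""

    def __init__(self, n_agents, obs_dim, act_dim, rng):
        self.in_dim = n_agents * (obs_dim + act_dim)
        self.w = make_linear(self.in_dim, 1, rng)

    def q_value(self, all_obs, all_acts):
        # Concatenate every agent's observation, then every agent's action
        joint = [v for o in all_obs for v in o]
        joint += [v for a in all_acts for v in a]
        assert len(joint) == self.in_dim
        return linear(self.w, joint)[0]


# Hypothetical sizes, chosen only for illustration
N_AGENTS, OBS_DIM, ACT_DIM = 3, 4, 2

rng = random.Random(0)
actors = [Actor(OBS_DIM, ACT_DIM, rng) for _ in range(N_AGENTS)]
critic = CentralCritic(N_AGENTS, OBS_DIM, ACT_DIM, rng)

# Each pursuer receives its own (random placeholder) observation
observations = [[rng.uniform(-1.0, 1.0) for _ in range(OBS_DIM)]
                for _ in range(N_AGENTS)]

# Execution is decentralized: actor i uses observation i only
actions = [actor.act(obs) for actor, obs in zip(actors, observations)]

# Training-time evaluation is centralized: the critic sees everything
q = critic.q_value(observations, actions)
```

The design point is the asymmetry of inputs: each actor's input has `OBS_DIM` entries, while the critic's input has `N_AGENTS * (OBS_DIM + ACT_DIM)` entries, so the extra information is needed only during training and each UAV can act on local observations at execution time.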