Reinforcement-Learning-Based AAV 3-D Target Tracking and Digital-Twin-Assisted Collision Avoidance With Integrated Sensing and Communication
Authors: Minghao Chen; Feng Shu; Min Zhu; Di Wu; Yu Yao; Qi Zhang
Journal: IEEE Internet of Things Journal, vol. 12, no. 13, pp. 24916-24928
Published: 2025-04-09
DOI: 10.1109/JIOT.2025.3559078
URL: https://ieeexplore.ieee.org/document/10960359/
Impact factor: 8.9 (JCR Q1, Computer Science, Information Systems)
The flexibility and maneuverability of autonomous aerial vehicles (AAVs) make them well suited to tracking users and operating as aerial base stations that enhance communication. A core challenge neglected by most existing works is that real but unknown obstacles can jeopardize AAV flight safety and shadow the communication links with users, resulting in a poor achievable rate and a high collision risk. In this article, a deep-reinforcement-learning (DRL)-based AAV target tracking and digital-twin (DT)-assisted collision avoidance method is proposed to optimize the AAV's communication performance while tracking moving users. To this end, the twin delayed deep deterministic policy gradient (TD3) algorithm, a policy-based DRL method, is used to construct an agent that adaptively decides the AAV's flight control actions. To efficiently detect unknown obstacles in the flight environment, an orthogonal frequency-division multiplexing (OFDM)-based integrated sensing and communication (ISAC) system is investigated, which provides the AAV's agent with real-time obstacle distances. Finally, we present a DT obstacle model construction mechanism and integrate it with TD3 agent training. Extensive simulations demonstrate the reward convergence of the TD3 algorithm and improved communication performance with stable user tracking and reliable collision avoidance, compared with conventional approaches.
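The abstract does not say how the OFDM-based ISAC system converts echoes into an obstacle distance. As an illustration only, the sketch below uses a common textbook approach to OFDM ranging: divide the received subcarrier symbols by the transmitted ones, take an inverse FFT across subcarriers, and map the peak bin to a range. The function names, the noise-free single-reflector echo model, and the parameter values (1024 subcarriers, 120 kHz spacing) are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

C = 299_792_458.0  # speed of light in m/s


def simulate_echo(tx_symbols, distance_m, subcarrier_spacing_hz):
    """Hypothetical noise-free echo from a single reflector at distance_m."""
    k = np.arange(len(tx_symbols))
    round_trip_delay = 2.0 * distance_m / C
    # A delay shows up as a linear phase ramp across the subcarriers.
    return tx_symbols * np.exp(-2j * np.pi * k * subcarrier_spacing_hz * round_trip_delay)


def estimate_range(tx_symbols, rx_symbols, subcarrier_spacing_hz):
    """Estimate the one-way distance to the strongest reflector from one OFDM symbol."""
    n = len(tx_symbols)
    # Element-wise division removes the known data and leaves the channel response.
    channel = rx_symbols / tx_symbols
    # The inverse FFT turns the phase ramp into a peak in the delay profile.
    delay_profile = np.abs(np.fft.ifft(channel))
    peak_bin = int(np.argmax(delay_profile))
    # Bin spacing in delay is 1 / (n * spacing); halve the round trip to get range.
    return peak_bin * C / (2.0 * n * subcarrier_spacing_hz)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, spacing = 1024, 120e3  # assumed values, for illustration only
    # Random QPSK symbols with unit power.
    tx = (rng.choice([-1.0, 1.0], n) + 1j * rng.choice([-1.0, 1.0], n)) / np.sqrt(2)
    rx = simulate_echo(tx, 50.0, spacing)
    print(f"estimated range: {estimate_range(tx, rx, spacing):.2f} m")
```

With these assumed parameters the range bin width is c / (2 · N · Δf), roughly 1.2 m, so the estimate can only be expected to match the true distance to within one bin. In a setup like the one the abstract describes, a distance estimate of this kind would be one of the observations passed to the TD3 agent.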
About the journal:
The IEEE Internet of Things (IoT) Journal publishes articles and review articles covering various aspects of IoT, including IoT system architecture, IoT enabling technologies, IoT communication and networking protocols such as network coding, and IoT services and applications. Topics encompass IoT's impact on sensor technologies, big data management, and future Internet design for applications such as smart cities and smart homes. Fields of interest include IoT architectures, such as things-centric, data-centric, and service-oriented architectures; IoT enabling technologies and systematic integration, such as sensor technologies, big sensor data management, and future Internet design for IoT; IoT services, applications, and test-beds, such as IoT service middleware, IoT application programming interfaces (APIs), IoT application design, and IoT trials and experiments; and IoT standardization activities and technology development in standards development organizations (SDOs) such as IEEE, IETF, ITU, 3GPP, and ETSI.