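Reinforcement-Learning-Based AAV 3-D Target Tracking and Digital-Twin-Assisted Collision Avoidance With Integrated Sensing and Communication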

IF: 8.9, Zone 1 (Computer Science), JCR Q1 (Computer Science, Information Systems)

IEEE Internet of Things Journal, Pub Date: 2025-04-09, DOI: 10.1109/JIOT.2025.3559078

Minghao Chen; Feng Shu; Min Zhu; Di Wu; Yu Yao; Qi Zhang

Volume 12, Issue 13, pp. 24916-24928. Publication type: Journal Article. URL: https://ieeexplore.ieee.org/document/10960359/

Citations: 0

Abstract

The flexibility and maneuverability of autonomous aerial vehicles (AAVs) make them well suited to tracking users and operating as aerial base stations that enhance communication. A core challenge neglected by most existing work is that true-but-unknown obstacles can jeopardize AAV flight safety and shadow the communication links with users, resulting in a poor achievable rate and a high collision risk. In this article, a deep-reinforcement-learning (DRL)-based AAV target tracking and digital-twin (DT)-assisted collision avoidance method is proposed to optimize the AAV's communication performance while tracking moving users. Toward this end, the twin delayed deep deterministic policy gradient (TD3) algorithm, a novel policy-based DRL algorithm, is used to construct an agent responsible for adaptively deciding the AAV's flight control actions. To efficiently detect unknown obstacles in the flight environment, an orthogonal frequency-division multiplexing (OFDM)-based integrated sensing and communication (ISAC) system is investigated, providing the AAV's agent with real-time obstacle distances. Finally, we present a DT obstacle model construction mechanism and integrate it with TD3 agent training. Extensive simulations demonstrate the reward convergence of the TD3 algorithm and improved communication performance with stable user tracking and reliable collision avoidance, compared with conventional approaches.
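The abstract does not say how the OFDM-based ISAC system turns echoes into an obstacle distance. As a rough illustration only, the sketch below uses a common textbook approach to OFDM ranging: divide the received subcarrier symbols by the known transmitted ones, take an IFFT across subcarriers, and read the range from the peak delay bin. It is not the authors' method. The function names, the single noise-free reflector, and the parameters (1024 subcarriers, 120 kHz spacing, 60 m obstacle) are all assumptions made for this example.

```python
import numpy as np

C = 299_792_458.0  # speed of light in m/s


def simulate_echo(tx_symbols, subcarrier_spacing_hz, range_m):
    """Hypothetical helper: noise-free echo from one reflector at range_m."""
    delay_s = 2.0 * range_m / C  # round-trip delay
    k = np.arange(len(tx_symbols))
    # A delay shows up as a linear phase ramp across the subcarriers.
    return tx_symbols * np.exp(-2j * np.pi * k * subcarrier_spacing_hz * delay_s)


def ofdm_range_estimate(tx_symbols, rx_symbols, subcarrier_spacing_hz):
    """Estimate the range of the dominant reflector from one OFDM symbol."""
    n = len(tx_symbols)
    # Dividing by the known transmitted symbols removes the data and
    # leaves the channel response (the phase ramp) on each subcarrier.
    channel = rx_symbols / tx_symbols
    # The IFFT over subcarriers converts the phase ramp into a delay peak.
    profile = np.abs(np.fft.ifft(channel))
    peak_bin = int(np.argmax(profile))
    # Delay bin k corresponds to a round-trip delay of k / (n * spacing).
    delay_s = peak_bin / (n * subcarrier_spacing_hz)
    return C * delay_s / 2.0


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_subcarriers, spacing_hz = 1024, 120e3  # assumed values
    # Unit-magnitude QPSK symbols, so the division above is always safe.
    tx = np.exp(1j * (np.pi / 4 + np.pi / 2 * rng.integers(0, 4, n_subcarriers)))
    rx = simulate_echo(tx, spacing_hz, range_m=60.0)
    print(f"estimated range: {ofdm_range_estimate(tx, rx, spacing_hz):.2f} m")
```

Under these assumptions the estimate is quantized to one delay bin, c / (2 N Δf), which is about 1.22 m for the values above. A practical system would also have to handle noise, multiple reflectors, and Doppler, none of which this sketch covers.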

Source journal

IEEE Internet of Things Journal (Computer Science, Information Systems)

CiteScore: 17.60

Self-citation rate: 13.20%

Articles published: 1982

Journal description: The IEEE Internet of Things (IoT) Journal publishes articles and review articles covering various aspects of IoT, including IoT system architecture, IoT enabling technologies, IoT communication and networking protocols such as network coding, and IoT services and applications. Topics encompass IoT's impact on sensor technologies, big data management, and future Internet design for applications such as smart cities and smart homes. Fields of interest include IoT architectures such as things-centric, data-centric, and service-oriented IoT architecture; IoT enabling technologies and systematic integration such as sensor technologies, big sensor data management, and future Internet design for IoT; IoT services, applications, and test-beds such as IoT service middleware, IoT application programming interfaces (APIs), IoT application design, and IoT trials and experiments; and IoT standardization activities and technology development in different standards development organizations (SDOs) such as IEEE, IETF, ITU, 3GPP, and ETSI.