Tao Wang, Xiaodong Ji, Xuan Zhu, Cheng He, Jian-Feng Gu
Journal: Vehicular Communications (Impact Factor 5.8; JCR Q1, Telecommunications)
Publication date: 2024-10-28
Publication type: Journal Article
DOI: 10.1016/j.vehcom.2024.100851
URL: https://www.sciencedirect.com/science/article/pii/S2214209624001268
Deep Reinforcement Learning based running-track path design for fixed-wing UAV assisted mobile relaying network
This paper studies a fixed-wing unmanned aerial vehicle (UAV) assisted mobile relaying network (FUAVMRN), in which a fixed-wing UAV uses out-band full-duplex relaying to serve a ground source-destination pair. It is confirmed that, for a FUAVMRN, a straight path is unsuitable when a large amount of data needs to be delivered, while a circular path may lead to low throughput when the ground source and destination are far apart. A running-track path (RTP) design problem is therefore investigated for the FUAVMRN with the goal of minimizing energy consumption. By dividing an RTP into two straight and two semicircular segments, the total energy consumption of the UAV and the total amount of data transferred from the ground source to the ground destination via the UAV relay are calculated. Following the Deep Reinforcement Learning framework and taking the UAV's roll-angle limit into account, the RTP design problem is formulated as a Markov Decision Process, with the state and action spaces and the policy and reward functions specified. To obtain the control policy for the UAV relay, Deep Deterministic Policy Gradient (DDPG) is used to solve the path design problem, leading to a DDPG-based algorithm for RTP design. Computer simulations show that the DDPG-based algorithm always converges at around 500 training iterations and that, compared with the circular and straight paths, the proposed RTP design saves at least 12.13% of energy and 65.93% of flight time when the ground source and destination are 2000 m apart and need to transfer 5000 bit/Hz of data. Moreover, it is more practical and more energy-efficient than the Deep Q Network based design.
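The decomposition of an RTP into two straight and two semicircular segments can be illustrated with a minimal geometric sketch. It is not the paper's model: it assumes constant airspeed and uses the standard level coordinated-turn relation r = v^2 / (g tan(phi)) to connect a roll-angle limit to a minimum turn radius. The function names and all numeric values below are hypothetical and chosen only for illustration.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def min_turn_radius(speed_mps: float, max_roll_deg: float) -> float:
    """Minimum radius of a level coordinated turn: r = v^2 / (g * tan(phi)).

    Assumption: standard flight-mechanics relation, not taken from the paper.
    """
    return speed_mps ** 2 / (G * math.tan(math.radians(max_roll_deg)))


def rtp_length(straight_m: float, radius_m: float) -> float:
    """Perimeter of a running-track path: two straights plus two semicircles."""
    return 2.0 * straight_m + 2.0 * math.pi * radius_m


def rtp_lap_time(straight_m: float, radius_m: float, speed_mps: float) -> float:
    """Time for one lap at constant airspeed (an assumption of this sketch)."""
    return rtp_length(straight_m, radius_m) / speed_mps


if __name__ == "__main__":
    speed = 30.0      # hypothetical airspeed in m/s
    max_roll = 30.0   # hypothetical roll-angle limit in degrees
    straight = 2000.0 # straight-segment length in m, set to the 2000 m ground separation

    radius = min_turn_radius(speed, max_roll)
    print(f"minimum turn radius: {radius:.1f} m")
    print(f"lap length: {rtp_length(straight, radius):.1f} m")
    print(f"lap time: {rtp_lap_time(straight, radius, speed):.1f} s")
```

A tighter roll-angle limit gives a larger minimum radius and therefore a longer lap, which is one reason the roll-angle limit has to enter the path design.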
About the journal:
Vehicular communications is a growing area covering communications between vehicles, including roadside communication infrastructure. Advances in wireless communications are making it possible to share information through real-time communications between vehicles and infrastructure. This has led to applications that increase vehicle safety and enable communication between passengers and the Internet. Standardization efforts on vehicular communication are also underway to make vehicular transportation safer, greener and easier.
The aim of the journal is to publish high-quality peer-reviewed papers in the area of vehicular communications. The scope encompasses all types of communications involving vehicles, including vehicle-to-vehicle and vehicle-to-infrastructure. The scope includes (but is not limited to) the following topics related to vehicular communications:
Vehicle to vehicle and vehicle to infrastructure communications
Channel modelling, modulating and coding
Congestion Control and scalability issues
Protocol design, testing and verification
Routing in vehicular networks
Security issues and countermeasures
Deployment and field testing
Reducing energy consumption and enhancing safety of vehicles
Wireless in-car networks
Data collection and dissemination methods
Mobility and handover issues
Safety and driver assistance applications
UAV
Underwater communications
Autonomous cooperative driving
Social networks
Internet of vehicles
Standardization of protocols.