SRAD: Autonomous Decision-Making Method for UAV Based on Safety Reinforcement Learning
Authors: Wenwen Xiao, Xiangfeng Luo, Shaorong Xie
Journal: Expert Systems, vol. 42, no. 5 (published 2025-03-22)
Publication type: Journal Article
DOI: 10.1111/exsy.70004
URL: https://onlinelibrary.wiley.com/doi/10.1111/exsy.70004
Journal category: Computer Science, Artificial Intelligence (JCR Q2; impact factor 3.0)
Abstract
Unmanned aerial vehicles (UAVs) are increasingly vital across numerous sectors, from logistics and rescue operations to military applications. However, ensuring safe decision-making for UAVs operating in real-world settings has become an urgent and complex challenge. At present, the main methods for reducing the risk of UAV decision-making rely on pre-established control rules, expert prior knowledge and regularisation constraints. These methods impose demanding prerequisites, including the acquisition of extensive decision-making experience and the establishment of comprehensive rules. These strict requirements often lead to UAV crashes in uncertain environments and to subsequent mission failures. To tackle these issues, we propose an autonomous decision-making method for quadcopter UAVs based on safe reinforcement learning. The method uses a multilevel cascading feature semantic space for reinforcement learning that integrates depth images, greyscale images, semantic segmentation images and object detection results as inputs, with the aim of facilitating safe autonomous learning. Moreover, we integrate real offline labelled data to enhance the safety policy. Depending on the level of risk encountered during the UAV's decision-making process, we dynamically select different safety policies. Through this iterative process, the UAV progressively eliminates extreme actions and reverts to the UAV learning policy module. Experimental results indicate that our method not only ensures safe decision-making for UAVs in uncertain environments but also achieves better safety performance than certain baseline methods.
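The risk-dependent policy selection mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the authors' algorithm, so everything here is an assumption made for illustration: the risk score in [0, 1], the thresholds 0.3 and 0.7, the policy names, and the toy policies are hypothetical and do not come from the paper.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Policy = Callable[[Sequence[float]], List[float]]


def select_policy_name(risk: float, low: float = 0.3, high: float = 0.7) -> str:
    """Map a risk score in [0, 1] to a policy name.

    The thresholds `low` and `high` are illustrative assumptions.
    """
    if not 0.0 <= risk <= 1.0:
        raise ValueError("risk must lie in [0, 1]")
    if risk < low:
        return "learning"    # low risk: trust the learned policy
    if risk < high:
        return "cautious"    # medium risk: bound the learned action
    return "emergency"       # high risk: fall back to a fixed safe action


def clip_action(action: Sequence[float], limit: float) -> List[float]:
    """Clamp every action component to [-limit, limit] to remove extreme actions."""
    return [max(-limit, min(limit, a)) for a in action]


# Toy stand-ins for the real policies (hypothetical).
def learning_policy(obs: Sequence[float]) -> List[float]:
    return [2.0 * x for x in obs]


def cautious_policy(obs: Sequence[float]) -> List[float]:
    return clip_action(learning_policy(obs), limit=0.5)


def emergency_policy(obs: Sequence[float]) -> List[float]:
    return [0.0 for _ in obs]  # e.g. hover in place


POLICIES: Dict[str, Policy] = {
    "learning": learning_policy,
    "cautious": cautious_policy,
    "emergency": emergency_policy,
}


def decide(obs: Sequence[float], risk: float) -> Tuple[str, List[float]]:
    """Pick a policy by risk level and return its name and chosen action."""
    name = select_policy_name(risk)
    return name, POLICIES[name](obs)


if __name__ == "__main__":
    for r in (0.1, 0.5, 0.9):
        print(r, decide([1.0, -0.1], r))
```

As the risk score falls, control returns to the learning policy, which loosely mirrors the abstract's statement that the UAV "reverts to the UAV learning policy module" once extreme actions have been eliminated.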
About the journal:
Expert Systems: The Journal of Knowledge Engineering publishes papers dealing with all aspects of knowledge engineering, including individual methods and techniques in knowledge acquisition and representation, and their application in the construction of systems – including expert systems – based thereon. Detailed scientific evaluation is an essential part of any paper.
As well as traditional application areas, such as Software and Requirements Engineering, Human-Computer Interaction, and Artificial Intelligence, we are aiming at the new and growing markets for these technologies, such as Business, Economy, Market Research, and Medical and Health Care. The shift towards this new focus will be marked by a series of special issues covering hot and emergent topics.