Tiantian Dong;Xianlu Song;Yonghong Zhang;Xiayang Qin;Yunping Liu;Zongchun Bai
{"title":"基于深度强化学习的vit任务驱动自主启发式导航","authors":"Tiantian Dong;Xianlu Song;Yonghong Zhang;Xiayang Qin;Yunping Liu;Zongchun Bai","doi":"10.1109/LRA.2025.3557305","DOIUrl":null,"url":null,"abstract":"In unknown environments lacking prior maps, achieving effective visual understanding is crucial for building highly efficient task - driven autonomous navigation systems. In this paper, we propose a vision - enabled goal - oriented autonomous navigation system. This system uses a novel hybrid vision Transformer architecture as the core of its visual perception. Our approach integrates an intermediate waypoint exploration strategy, breaking down a given task into a series of consecutive subtargets. These subtargets are then fed into the scene encoder as an important part of the current physical task state, thereby achieving seamless integration of scene representation and current target information. Based on this, we utilize a deep reinforcement learning framework to develop a local navigation strategy for each subtarget. Given the challenge of addressing the sparse reward function problem, we design a novel hazardous region transfer function.In the simulation experiment stage, we validate the effectiveness of the proposed autonomous navigation system and compare it with other deep - reinforcement - learning - based navigation methods. The experimental results show that our method has significant advantages in terms of navigation success rate and efficiency. Additionally, in the Sim2Real experiments, compared with other algorithms, our method demonstrates greater robustness and mobility.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"10 6","pages":"5297-5304"},"PeriodicalIF":4.6000,"publicationDate":"2025-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"ViT-Enabled Task-Driven Autonomous Heuristic Navigation Based on Deep Reinforcement Learning\",\"authors\":\"Tiantian Dong;Xianlu Song;Yonghong Zhang;Xiayang Qin;Yunping Liu;Zongchun Bai\",\"doi\":\"10.1109/LRA.2025.3557305\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In unknown environments lacking prior maps, achieving effective visual understanding is crucial for building highly efficient task - driven autonomous navigation systems. In this paper, we propose a vision - enabled goal - oriented autonomous navigation system. This system uses a novel hybrid vision Transformer architecture as the core of its visual perception. Our approach integrates an intermediate waypoint exploration strategy, breaking down a given task into a series of consecutive subtargets. These subtargets are then fed into the scene encoder as an important part of the current physical task state, thereby achieving seamless integration of scene representation and current target information. Based on this, we utilize a deep reinforcement learning framework to develop a local navigation strategy for each subtarget. Given the challenge of addressing the sparse reward function problem, we design a novel hazardous region transfer function.In the simulation experiment stage, we validate the effectiveness of the proposed autonomous navigation system and compare it with other deep - reinforcement - learning - based navigation methods. The experimental results show that our method has significant advantages in terms of navigation success rate and efficiency. 
Additionally, in the Sim2Real experiments, compared with other algorithms, our method demonstrates greater robustness and mobility.\",\"PeriodicalId\":13241,\"journal\":{\"name\":\"IEEE Robotics and Automation Letters\",\"volume\":\"10 6\",\"pages\":\"5297-5304\"},\"PeriodicalIF\":4.6000,\"publicationDate\":\"2025-04-02\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Robotics and Automation Letters\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10947526/\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"ROBOTICS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Robotics and Automation Letters","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10947526/","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"ROBOTICS","Score":null,"Total":0}
ViT-Enabled Task-Driven Autonomous Heuristic Navigation Based on Deep Reinforcement Learning
In unknown environments lacking prior maps, effective visual understanding is crucial for building highly efficient task-driven autonomous navigation systems. In this paper, we propose a vision-enabled, goal-oriented autonomous navigation system that uses a novel hybrid Vision Transformer architecture as the core of its visual perception. Our approach integrates an intermediate-waypoint exploration strategy that breaks a given task down into a series of consecutive subtargets. These subtargets are then fed into the scene encoder as an important part of the current physical task state, achieving seamless integration of scene representation and current target information. On this basis, we use a deep reinforcement learning framework to learn a local navigation strategy for each subtarget. To address the sparse-reward problem, we design a novel hazardous-region transfer function. In simulation experiments, we validate the effectiveness of the proposed autonomous navigation system and compare it with other deep-reinforcement-learning-based navigation methods; the results show that our method has significant advantages in navigation success rate and efficiency. Additionally, in Sim2Real experiments, our method demonstrates greater robustness and mobility than the other algorithms.
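To make the reward-shaping idea concrete, the sketch below shows one common way a dense penalty around hazardous regions can supplement a sparse subtarget reward. This is an illustrative assumption only, not the paper's actual hazardous-region transfer function; all function names, radii, and weights are hypothetical.

import math

# Illustrative sketch: shaped reward = dense progress toward the current
# subtarget + sparse success bonus - penalty that grows inside a hazardous
# region around obstacles. All parameters are hypothetical.
def shaped_reward(pos, subgoal, prev_dist, obstacles,
                  goal_radius=0.3, hazard_radius=0.6,
                  w_progress=1.0, w_hazard=0.5):
    dist = math.dist(pos, subgoal)
    reward = w_progress * (prev_dist - dist)        # dense progress term
    if dist < goal_radius:
        reward += 10.0                              # sparse bonus on reaching the subtarget
    nearest = min(math.dist(pos, o) for o in obstacles)
    if nearest < hazard_radius:
        # penalty ramps up as the robot moves deeper into the hazardous region
        reward -= w_hazard * (hazard_radius - nearest) / hazard_radius
    return reward, dist

# Example step: robot at (1, 1) moving toward subtarget (3, 3), one obstacle at (1.4, 1.0)
r, d = shaped_reward((1.0, 1.0), (3.0, 3.0), prev_dist=3.0, obstacles=[(1.4, 1.0)])
print(round(r, 3), round(d, 3))

Under such a scheme, each intermediate waypoint produced by the exploration strategy would serve as the subgoal for the local deep-reinforcement-learning policy until it is reached, after which the next subtarget takes its place.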
About the journal:
This journal publishes peer-reviewed articles that provide a timely and concise account of innovative research ideas and application results, reporting significant theoretical findings and application case studies in robotics and automation.