Bidirectional Agent-Map Interaction Feature Learning Leveraged by Map-Related Tasks for Trajectory Prediction in Autonomous Driving

IF 6.4; Zone 2, Computer Science; JCR Q1, AUTOMATION & CONTROL SYSTEMS

IEEE Transactions on Automation Science and Engineering, vol. 22, pp. 10801-10813. Pub Date: 2025-01-14. DOI: 10.1109/TASE.2025.3529736. URL: https://ieeexplore.ieee.org/document/10841390/

Bin Fan;Haifeng Yuan;Yinke Dong;Zhengyu Zhu;Hongmin Liu


Citations: 0

Abstract


Bidirectional Agent-Map Interaction Feature Learning Leveraged by Map-Related Tasks for Trajectory Prediction in Autonomous Driving

Accurate prediction of the future trajectories of surrounding agents is essential for the safe operation of autonomous vehicles. However, the task is challenging because frequent agent-agent and agent-map interactions make driving environments highly dynamic. Most existing methods focus on modeling the dynamics of agents but overlook the inherent dynamics of the usable map at different driving moments. To address this limitation, this paper proposes a bidirectional interaction network (DyMap) that uses a bidirectional attention module to account for both the constraints the map imposes on the agents and the impact of the agents on the map. By stacking multiple stages of bidirectional interaction modules, the features of agents and map nodes are continuously updated to adapt to changing environments. To facilitate network learning, we further incorporate map-related auxiliary tasks into the learning objectives: traffic flow prediction, map node occupancy prediction, and road direction prediction. These tasks are designed to help the network learn map features that reflect the dynamics of the driving environment, leading to a comprehensive feature representation of the agents that encodes not only each agent's historical motion but also the latent map information triggered by other agents. The proposed network offers a new approach to modeling environmental dynamics in driving scenarios and achieves state-of-the-art performance on the Argoverse 1 and Argoverse 2 trajectory prediction benchmarks.

Note to Practitioners: Autonomous vehicles are of particular interest to practitioners in the field of automation science and technology. Trajectory prediction, which is critical to the safety of autonomous driving, aims to forecast the future positions of a given traffic participant (called an agent in this paper, such as a vehicle or pedestrian) based on observations of its own and other agents' historical trajectories as well as map information. This paper proposes a trajectory prediction network built on a novel bidirectional interaction module that enables bidirectional information flow between the features of map nodes and agents, so that their feature representations are learned simultaneously while remaining aware of the time-varying nature of traffic scenes. The proposed DyMap contains three blocks of bidirectional interaction modules, interleaved with self-interactions of the agents and of the map nodes. Three kinds of map-related tasks are developed to facilitate network training by making full use of the map features to predict traffic flow, occupancy, and lane direction. The superiority of the proposed method has been confirmed on benchmarks widely used in the community. Beyond modeling the complex interactions between agents and map nodes in dynamic traffic scenarios, the proposed bidirectional interaction network may benefit other problems that require symmetric modeling of interactions between different components, such as human-robot interaction. The proposed map-related tasks can be used directly in other applications that need to extract map features under time-varying conditions, including but not limited to robot navigation and embodied intelligence.
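The abstract gives no implementation details, so the following is only a minimal sketch of what a bidirectional agent-map interaction could look like, not the authors' DyMap code. It assumes plain scaled dot-product attention with a residual update and no learned projections or multiple heads. The names `cross_attend`, `bidirectional_block`, and `encode` are hypothetical, and the ordering of self-interaction and bidirectional steps inside each of the three blocks is an assumption.

```python
import math
import random
from typing import List, Tuple

Matrix = List[List[float]]  # one feature row per agent or per map node


def softmax(row: List[float]) -> List[float]:
    """Numerically stable softmax over one row of attention scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(queries: Matrix, context: Matrix) -> Matrix:
    """Add to each query row an attention-weighted sum of the context rows."""
    scale = math.sqrt(len(queries[0]))
    updated = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, c)) / scale for c in context]
        weights = softmax(scores)
        mixed = [sum(w * c[d] for w, c in zip(weights, context))
                 for d in range(len(q))]
        updated.append([qi + mi for qi, mi in zip(q, mixed)])  # residual update
    return updated


def bidirectional_block(agents: Matrix, map_nodes: Matrix) -> Tuple[Matrix, Matrix]:
    """Both directions read the same pre-update features (an assumption)."""
    new_agents = cross_attend(agents, map_nodes)  # map constrains agents
    new_map = cross_attend(map_nodes, agents)     # agents influence the map
    return new_agents, new_map


def encode(agents: Matrix, map_nodes: Matrix, num_blocks: int = 3) -> Tuple[Matrix, Matrix]:
    """Interleave self-interactions with bidirectional interactions."""
    for _ in range(num_blocks):
        agents = cross_attend(agents, agents)           # agent self-interaction
        map_nodes = cross_attend(map_nodes, map_nodes)  # map self-interaction
        agents, map_nodes = bidirectional_block(agents, map_nodes)
    return agents, map_nodes


random.seed(0)
agents = [[random.gauss(0, 1) for _ in range(4)] for _ in range(2)]     # 2 agents
map_nodes = [[random.gauss(0, 1) for _ in range(4)] for _ in range(5)]  # 5 map nodes
out_agents, out_map = encode(agents, map_nodes)
print(len(out_agents), len(out_agents[0]), len(out_map), len(out_map[0]))  # prints: 2 4 5 4
```

The point of the sketch is that map-node features are updated from agent features as well as the reverse, so both sets of features change at every stage. A real model would add learned query, key, and value projections and would feed the resulting map features into the auxiliary prediction heads.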


Source journal

IEEE Transactions on Automation Science and Engineering (Engineering Technology: Automation and Control Systems)

CiteScore: 12.50
Self-citation rate: 14.30%
Articles published: 404
Review time: 3.0 months

Journal description: The IEEE Transactions on Automation Science and Engineering (T-ASE) publishes fundamental papers on Automation, emphasizing scientific results that advance efficiency, quality, productivity, and reliability. T-ASE encourages interdisciplinary approaches from computer science, control systems, electrical engineering, mathematics, mechanical engineering, operations research, and other fields. T-ASE welcomes results relevant to industries such as agriculture, biotechnology, healthcare, home automation, maintenance, manufacturing, pharmaceuticals, retail, security, service, supply chains, and transportation. T-ASE addresses a research community willing to integrate knowledge across disciplines and industries. For this purpose, each paper includes a Note to Practitioners that summarizes how its results can be applied or how they might be extended to apply in practice.