Bin Fan;Haifeng Yuan;Yinke Dong;Zhengyu Zhu;Hongmin Liu
Title: Bidirectional Agent-Map Interaction Feature Learning Leveraged by Map-Related Tasks for Trajectory Prediction in Autonomous Driving
DOI: 10.1109/TASE.2025.3529736
Journal: IEEE Transactions on Automation Science and Engineering, vol. 22, pp. 10801-10813
Publication date: 2025-01-14
Publication type: Journal Article
Journal impact factor: 6.4; JCR: Q1 (Automation & Control Systems)
URL: https://ieeexplore.ieee.org/document/10841390/
Citation count: 0
Abstract
Accurate prediction of the future trajectories of surrounding agents is essential for the safe operation of autonomous vehicles. However, it is challenging because driving environments are dynamic, shaped by frequent agent-agent and agent-map interactions. Most existing methods focus on modeling the dynamics of agents but overlook the inherent dynamics of the usable map at different driving moments. To address this limitation, this paper proposes a bidirectional interaction network (DyMap) that accounts for both the constraints the map places on the agents and the impact of the agents on the map, using a bidirectional attention module. By employing multiple stages of bidirectional interaction modules, the features of agents and map nodes are continuously updated to adapt to changing environments. To facilitate network learning, we further incorporate map-related auxiliary tasks into the learning objectives, including traffic flow prediction, map node occupancy prediction, and road direction prediction. These tasks are designed to help the network learn map features that reflect the dynamics of the driving environment, leading to a comprehensive feature representation of the agents. This representation encodes not only the agent’s historical motions but also the latent map information triggered by other agents. The proposed network offers a new approach to modeling environmental dynamics in driving scenarios and achieves state-of-the-art performance on the Argoverse 1 and Argoverse 2 trajectory prediction benchmarks.

Note to Practitioners: Autonomous vehicles are of particular interest to practitioners in the field of automation science and technology. Trajectory prediction is of critical importance to the safety of autonomous driving. It aims to forecast the future positions of a given traffic participant (called an agent in this paper, such as a vehicle or pedestrian) based on observations of its own and other agents’ historical trajectories as well as the map information. This paper proposes a trajectory prediction network based on a novel bidirectional interaction module that enables bidirectional information flow between the features of map nodes and agents, so that their feature representations are learned jointly and reflect the time-varying nature of traffic scenes. The proposed DyMap contains three blocks of bidirectional interaction modules, which are interleaved with self-interactions of the agents and map nodes. Three kinds of map-related tasks are developed to facilitate network training by using the map features to predict traffic flow, occupancy, and lane direction. The superiority of the proposed method has been confirmed on benchmarks widely used in the community. Besides modeling the complex interaction between agents and map nodes in dynamic traffic scenarios, the proposed bidirectional interaction network may also benefit other problems that require symmetric modeling of interactions between different components, such as human-robot interaction. The proposed map-related tasks can be directly used in other applications that need to extract map features under time-varying conditions, including but not limited to robot navigation and embodied intelligence.
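The bidirectional information flow described above can be illustrated with a small sketch. This is not the authors' implementation. It is a minimal, dependency-free illustration that assumes plain scaled dot-product attention without learned projections, a residual update, and a particular ordering of cross- and self-interaction inside each block. The function names (`attend`, `bidirectional_block`, `run_stages`) and the feature sizes are hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(queries, context):
    """Each query row attends over all context rows and is updated residually.

    Assumption: raw features serve as queries, keys, and values; a real
    network would apply learned projections and normalization.
    """
    dim = len(queries[0])
    updated = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, c)) / math.sqrt(dim) for c in context]
        weights = softmax(scores)
        mixed = [sum(w * c[d] for w, c in zip(weights, context)) for d in range(dim)]
        updated.append([a + m for a, m in zip(q, mixed)])  # residual connection
    return updated


def bidirectional_block(agents, map_nodes):
    """One block: map-to-agent and agent-to-map updates, then self-interaction.

    Both directions read the features from before the block (assumed here).
    """
    new_agents = attend(agents, map_nodes)   # map constrains agents
    new_map = attend(map_nodes, agents)      # agents influence the map
    new_agents = attend(new_agents, new_agents)  # agent self-interaction
    new_map = attend(new_map, new_map)           # map self-interaction
    return new_agents, new_map


def run_stages(agents, map_nodes, num_stages=3):
    """Stack several blocks; three matches the block count stated in the text."""
    for _ in range(num_stages):
        agents, map_nodes = bidirectional_block(agents, map_nodes)
    return agents, map_nodes


random.seed(0)
demo_agents = [[random.gauss(0, 1) for _ in range(4)] for _ in range(2)]
demo_map = [[random.gauss(0, 1) for _ in range(4)] for _ in range(5)]
agents_out, map_out = run_stages(demo_agents, demo_map)
print(len(agents_out), len(map_out))
```

The point of the sketch is that map features are not a static context: after every block, both the agent features and the map-node features have changed, so later blocks operate on a map representation that already reflects the surrounding traffic.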
About the journal:
The IEEE Transactions on Automation Science and Engineering (T-ASE) publishes fundamental papers on Automation, emphasizing scientific results that advance efficiency, quality, productivity, and reliability. T-ASE encourages interdisciplinary approaches from computer science, control systems, electrical engineering, mathematics, mechanical engineering, operations research, and other fields. T-ASE welcomes results relevant to industries such as agriculture, biotechnology, healthcare, home automation, maintenance, manufacturing, pharmaceuticals, retail, security, service, supply chains, and transportation. T-ASE addresses a research community willing to integrate knowledge across disciplines and industries. For this purpose, each paper includes a Note to Practitioners that summarizes how its results can be applied or how they might be extended to apply in practice.