Titong Jiang, Qing Dong, Yuan Ma, Xuewu Ji, Yahui Liu
Expert Systems with Applications, Volume 288, Article 128222. Published 2025-05-18. DOI: 10.1016/j.eswa.2025.128222. Publisher page: https://www.sciencedirect.com/science/article/pii/S0957417425018421
Customizable multimodal trajectory prediction via nodes of interest selection for autonomous vehicles
To safely navigate through complex traffic scenarios, autonomous vehicles (AVs) must accurately predict the future trajectories of surrounding agents. Therefore, there has been a surge of interest in the problem of trajectory prediction for AVs. Building upon existing studies, we aim to push the boundaries of state-of-the-art research by tackling the following challenges: (1) the interaction between agents is heavily dependent on road geometry and topology; (2) certain modalities of the surrounding agent are non-informative for the AV and can be disregarded; and (3) the diversity of multimodal prediction is limited by the maximum number of modalities. In this study, we propose Customizable Multimodal Transformer (CMT), a deep learning model which facilitates customizable multimodal trajectory prediction. First, inspired by the dependency between agent interaction and road geometry and topology, we propose that map information can be utilized to better understand agent interaction. Furthermore, we propose the concept of nodes of interest (NOI), which represents the area of interest of the AV. By manipulating the nodes in the NOI, CMT can generate customized prediction results where irrelevant modalities can be disregarded without compromising the safety of the AV, leading to reduced computational costs. Finally, we propose to enhance the diversity of multimodal prediction results through Gaussian mixture reduction via clustering (GMRC). Extensive experiments on nuScenes and Argoverse datasets demonstrate that CMT not only outperforms previous state-of-the-art models, but also exhibits great potential for reducing computational costs and improving inference speed for trajectory prediction of AVs. Code is available at https://github.com/Promisery/CMT.
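The abstract does not say how GMRC works beyond its name, so the following is only a minimal sketch of the general idea of reducing a Gaussian mixture by clustering, not the authors' implementation. It assumes each predicted modality is a 2D Gaussian (weight, mean, covariance), groups components by k-means on their means, and merges each group with a moment-preserving merge. The function names `merge_components` and `reduce_mixture`, and the choice to seed clusters with the highest-weight components, are hypothetical.

```python
import numpy as np


def merge_components(weights, means, covs):
    """Moment-preserving merge of several Gaussian components into one."""
    w = weights.sum()
    mu = (weights[:, None] * means).sum(axis=0) / w
    diff = means - mu
    # Merged covariance = weighted within-component covariance
    # plus the spread of the component means around the merged mean.
    outer = diff[:, :, None] * diff[:, None, :]
    cov = (weights[:, None, None] * (covs + outer)).sum(axis=0) / w
    return w, mu, cov


def assign(means, centers):
    """Index of the nearest center for every component mean."""
    dists = np.linalg.norm(means[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1)


def reduce_mixture(weights, means, covs, k, iters=20):
    """Reduce an N-component mixture to at most k components.

    Assumption (not from the paper): clusters are seeded with the k
    highest-weight components and refined by weighted k-means on the means.
    """
    centers = means[np.argsort(-weights)[:k]].copy()
    for _ in range(iters):
        labels = assign(means, centers)
        for j in range(len(centers)):
            mask = labels == j
            if mask.any():
                centers[j] = np.average(means[mask], axis=0, weights=weights[mask])
    labels = assign(means, centers)
    merged = [
        merge_components(weights[labels == j], means[labels == j], covs[labels == j])
        for j in range(len(centers))
        if (labels == j).any()
    ]
    w, mu, cov = zip(*merged)
    return np.array(w), np.array(mu), np.array(cov)


if __name__ == "__main__":
    # Four toy modalities forming two well-separated groups.
    weights = np.array([0.3, 0.2, 0.25, 0.25])
    means = np.array([[0.0, 0.0], [0.2, 0.0], [10.0, 10.0], [10.2, 10.0]])
    covs = np.tile(0.1 * np.eye(2), (4, 1, 1))
    w, mu, cov = reduce_mixture(weights, means, covs, k=2)
    print(w, mu, sep="\n")
```

Merging by moments keeps the total weight and the overall mean of the mixture unchanged, so a model could in principle predict more modalities than it reports and reduce them afterwards. Whether CMT does this in the same way is not stated in the abstract.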
About the journal:
Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans various sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.