{"title":"基于空间交互变压器网络的行人轨迹预测","authors":"Tong Su, Yu Meng, Yan Xu","doi":"10.1109/ivworkshops54471.2021.9669249","DOIUrl":null,"url":null,"abstract":"As a core technology of the autonomous driving system, pedestrian trajectory prediction can significantly enhance the function of active vehicle safety and reduce road traffic injuries. In traffic scenes, when encountering with oncoming people, pedestrians may make sudden turns or stop immediately, which often leads to complicated trajectories. To predict such unpredictable trajectories, we can gain insights into the interaction between pedestrians. In this paper, we present a novel generative method named Spatial Interaction Transformer (SIT), which learns the spatio-temporal correlation of pedestrian trajectories through attention mechanisms. Furthermore, we introduce the conditional variational autoencoder (CVAE) [1] framework to model the future latent motion states of pedestrians. In particular, the experiments based on large-scale traffic dataset nuScenes [2] show that SIT has an outstanding performance than state-of-the-art (SOTA) methods. Experimental evaluation on the challenging ETH [3] and UCY [4] datasets confirms the robustness of our proposed model.","PeriodicalId":256905,"journal":{"name":"2021 IEEE Intelligent Vehicles Symposium Workshops (IV Workshops)","volume":"53 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"7","resultStr":"{\"title\":\"Pedestrian Trajectory Prediction via Spatial Interaction Transformer Network\",\"authors\":\"Tong Su, Yu Meng, Yan Xu\",\"doi\":\"10.1109/ivworkshops54471.2021.9669249\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"As a core technology of the autonomous driving system, pedestrian trajectory prediction can significantly enhance the function of active vehicle safety and reduce road traffic injuries. In traffic scenes, when encountering with oncoming people, pedestrians may make sudden turns or stop immediately, which often leads to complicated trajectories. To predict such unpredictable trajectories, we can gain insights into the interaction between pedestrians. In this paper, we present a novel generative method named Spatial Interaction Transformer (SIT), which learns the spatio-temporal correlation of pedestrian trajectories through attention mechanisms. Furthermore, we introduce the conditional variational autoencoder (CVAE) [1] framework to model the future latent motion states of pedestrians. In particular, the experiments based on large-scale traffic dataset nuScenes [2] show that SIT has an outstanding performance than state-of-the-art (SOTA) methods. Experimental evaluation on the challenging ETH [3] and UCY [4] datasets confirms the robustness of our proposed model.\",\"PeriodicalId\":256905,\"journal\":{\"name\":\"2021 IEEE Intelligent Vehicles Symposium Workshops (IV Workshops)\",\"volume\":\"53 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-07-11\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"7\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2021 IEEE Intelligent Vehicles Symposium Workshops (IV Workshops)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ivworkshops54471.2021.9669249\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 IEEE Intelligent Vehicles Symposium Workshops (IV Workshops)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ivworkshops54471.2021.9669249","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Pedestrian Trajectory Prediction via Spatial Interaction Transformer Network
As a core technology of the autonomous driving system, pedestrian trajectory prediction can significantly enhance the function of active vehicle safety and reduce road traffic injuries. In traffic scenes, when encountering with oncoming people, pedestrians may make sudden turns or stop immediately, which often leads to complicated trajectories. To predict such unpredictable trajectories, we can gain insights into the interaction between pedestrians. In this paper, we present a novel generative method named Spatial Interaction Transformer (SIT), which learns the spatio-temporal correlation of pedestrian trajectories through attention mechanisms. Furthermore, we introduce the conditional variational autoencoder (CVAE) [1] framework to model the future latent motion states of pedestrians. In particular, the experiments based on large-scale traffic dataset nuScenes [2] show that SIT has an outstanding performance than state-of-the-art (SOTA) methods. Experimental evaluation on the challenging ETH [3] and UCY [4] datasets confirms the robustness of our proposed model.