Spatiotemporal Pyramid Aggregation and Graph Attention for Scene Perception and Trajectory Prediction
Jianhong Zou, Yihui Cui, Ting Zhao, Weihua Ouyang, Bei Luo, Qilie Liu
2022 6th Asian Conference on Artificial Intelligence Technology (ACAIT), published 2022-12-09
DOI: 10.1109/ACAIT56212.2022.10137838 (https://doi.org/10.1109/ACAIT56212.2022.10137838)
Abstract
In autonomous driving systems, accurate scene perception and trajectory prediction are critical for collision avoidance and path planning. This paper proposes a scene perception and trajectory prediction method that uses a graph attention mechanism to learn semantic and interaction information from a bird's-eye-view (BEV) map. The method combines a spatiotemporal pyramid network with a graph attention network. The spatiotemporal pyramid network models the surrounding environment to obtain scene semantic features, while the graph attention network models the interactions among surrounding traffic participants to obtain graph interaction features. The scene semantic features and graph interaction features are then fused into a unified feature space for the downstream pixel-level classification and trajectory prediction tasks. Compared with the baseline method, the proposed method significantly improves the average classification accuracy and reduces the average trajectory prediction error while remaining efficient. Experimental results show that the proposed method performs better and is more feasible to deploy in real-world autonomous driving scenarios.
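To make the two ideas in the abstract concrete, the following is a minimal sketch of (a) a single-head graph attention layer over traffic participants and (b) fusing the resulting interaction features with a BEV scene feature map. It is not the authors' implementation. The abstract does not specify layer sizes, the attention form, or the fusion operator, so everything here is an assumption: the attention follows the commonly used GAT formulation (shared linear map, LeakyReLU-scored pairs, softmax over neighbours), the fusion is plain channel concatenation at each agent's BEV cell, and the names `graph_attention`, `fuse`, `W`, `a`, `scene_map`, and `agent_cells` are hypothetical. The `scene_map` argument merely stands in for the output of the spatiotemporal pyramid network, which is not modelled.

```python
import math


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def matvec(W, x):
    # W is a list of rows (out_dim x in_dim); x is a list of length in_dim.
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def graph_attention(features, adjacency, W, a):
    """Single-head GAT-style layer (assumed form, not taken from the paper).

    features:  N agent feature vectors
    adjacency: N x N matrix of 0/1 entries; self-loops are assumed present
    W:         shared linear map, out_dim x in_dim
    a:         attention vector of length 2 * out_dim
    """
    h = [matvec(W, x) for x in features]
    out = []
    for i, hi in enumerate(h):
        nbrs = [j for j, connected in enumerate(adjacency[i]) if connected]
        # Score each neighbour from the concatenated pair [h_i, h_j].
        scores = [
            leaky_relu(sum(p * q for p, q in zip(a, hi + h[j]))) for j in nbrs
        ]
        # Numerically stable softmax over the neighbourhood.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        alphas = [e / total for e in exps]
        # Attention-weighted sum of neighbour features.
        out.append(
            [sum(al * h[j][k] for al, j in zip(alphas, nbrs))
             for k in range(len(hi))]
        )
    return out


def fuse(scene_map, agent_cells, agent_feats):
    """Concatenate each agent's interaction feature onto the scene feature
    at its BEV cell; cells without an agent are padded with zeros.
    (Concatenation is an assumed fusion operator.)"""
    d = len(agent_feats[0])
    fused = [[cell + [0.0] * d for cell in row] for row in scene_map]
    for (r, c), feat in zip(agent_cells, agent_feats):
        fused[r][c] = scene_map[r][c] + list(feat)
    return fused


# Toy example: three agents on a 2 x 2 BEV grid with 1-channel scene features.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
adjacency = [[1, 1, 0], [1, 1, 1], [0, 1, 1]]
W = [[1.0, 0.0], [0.0, 1.0]]   # identity map, for readability only
a = [0.0, 0.0, 0.0, 0.0]       # zero scores give uniform attention
interaction = graph_attention(features, adjacency, W, a)

scene_map = [[[0.1], [0.2]], [[0.3], [0.4]]]
agent_cells = [(0, 0), (0, 1), (1, 1)]
fused = fuse(scene_map, agent_cells, interaction)
```

With the zero attention vector used above, every neighbour receives the same weight, so each agent's output is simply the mean of its neighbours' features; a trained `a` would instead weight some participants more heavily than others. The fused grid has three channels per cell (one scene channel plus two interaction channels) and could then feed per-pixel classification and trajectory heads.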