{"title":"自动驾驶的端到端时空注意模型","authors":"Ruijie Zhao, Yanxin Zhang, Zhiqing Huang, Chenkun Yin","doi":"10.1109/ITNEC48623.2020.9085185","DOIUrl":null,"url":null,"abstract":"In recent years, end-to-end autonomous driving has become an emerging research direction in the field of autonomous driving. This method attempts to map the road images collected by the vehicle camera to the decision control of the vehicle. We propose a spatiotemporal neural network model with a visual attention mechanism to predict vehicle decision control in an end-to-end manner. The model is composed of CNN and LSTM and can extract temporal and spatial features from road image sequences. The visual attention mechanism in the model helps the model to focus on important areas in the image. We evaluated the model in the open racing car simulator TORCS, and the experiments showed that our model is better at predicting driving decisions than the simple CNN model. In addition, the visual attention mechanism in the model is conducive to improving the performance of the end-to-end autonomous driving model.","PeriodicalId":235524,"journal":{"name":"2020 IEEE 4th Information Technology, Networking, Electronic and Automation Control Conference (ITNEC)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":"{\"title\":\"End-to-end Spatiotemporal Attention Model for Autonomous Driving\",\"authors\":\"Ruijie Zhao, Yanxin Zhang, Zhiqing Huang, Chenkun Yin\",\"doi\":\"10.1109/ITNEC48623.2020.9085185\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In recent years, end-to-end autonomous driving has become an emerging research direction in the field of autonomous driving. This method attempts to map the road images collected by the vehicle camera to the decision control of the vehicle. We propose a spatiotemporal neural network model with a visual attention mechanism to predict vehicle decision control in an end-to-end manner. The model is composed of CNN and LSTM and can extract temporal and spatial features from road image sequences. The visual attention mechanism in the model helps the model to focus on important areas in the image. We evaluated the model in the open racing car simulator TORCS, and the experiments showed that our model is better at predicting driving decisions than the simple CNN model. 
In addition, the visual attention mechanism in the model is conducive to improving the performance of the end-to-end autonomous driving model.\",\"PeriodicalId\":235524,\"journal\":{\"name\":\"2020 IEEE 4th Information Technology, Networking, Electronic and Automation Control Conference (ITNEC)\",\"volume\":\"14 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-06-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"5\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2020 IEEE 4th Information Technology, Networking, Electronic and Automation Control Conference (ITNEC)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ITNEC48623.2020.9085185\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 IEEE 4th Information Technology, Networking, Electronic and Automation Control Conference (ITNEC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ITNEC48623.2020.9085185","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
End-to-end Spatiotemporal Attention Model for Autonomous Driving
In recent years, end-to-end autonomous driving has become an emerging research direction in the field of autonomous driving. This approach maps road images collected by the vehicle camera directly to the vehicle's control decisions. We propose a spatiotemporal neural network model with a visual attention mechanism that predicts vehicle control decisions in an end-to-end manner. The model combines a CNN and an LSTM to extract spatial and temporal features from road image sequences, and its visual attention mechanism helps it focus on the important regions of each image. We evaluated the model in the open racing car simulator TORCS; the experiments show that it predicts driving decisions better than a plain CNN model, and that the visual attention mechanism improves the performance of the end-to-end autonomous driving model.
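The abstract only outlines the architecture (a per-frame CNN encoder, a spatial soft-attention step, and an LSTM over the frame sequence), so the sketch below is an illustration of that general design, not the authors' implementation. The layer sizes, the 64x64 input resolution, the single steering output, and the class name SpatioTemporalAttentionDriver are all assumptions made for the example.

```python
# Minimal sketch (assumed details, not the paper's code): CNN features per frame,
# spatial soft attention conditioned on the LSTM state, LSTM over time, control head.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SpatioTemporalAttentionDriver(nn.Module):
    def __init__(self, hidden_size=256):
        super().__init__()
        self.hidden_size = hidden_size
        # Per-frame CNN encoder: 3x64x64 road image -> 128-channel 8x8 feature map.
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.feat_dim = 128
        # Attention: score each of the 8x8 spatial locations given the previous hidden state.
        self.attn = nn.Linear(self.feat_dim + hidden_size, 1)
        # LSTM cell aggregates the attended features over the image sequence.
        self.lstm = nn.LSTMCell(self.feat_dim, hidden_size)
        # Control head: a single steering value (could be widened to throttle/brake).
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, frames):
        # frames: (B, T, 3, 64, 64) sequence of road images.
        B, T = frames.shape[:2]
        h = frames.new_zeros(B, self.hidden_size)
        c = frames.new_zeros(B, self.hidden_size)
        outputs = []
        for t in range(T):
            feat = self.cnn(frames[:, t])            # (B, 128, 8, 8)
            feat = feat.flatten(2).transpose(1, 2)   # (B, 64, 128): locations x channels
            # Soft attention weights over the 64 spatial locations.
            h_rep = h.unsqueeze(1).expand(-1, feat.size(1), -1)
            alpha = F.softmax(self.attn(torch.cat([feat, h_rep], dim=-1)), dim=1)
            context = (alpha * feat).sum(dim=1)      # (B, 128) attended feature vector
            h, c = self.lstm(context, (h, c))
            outputs.append(self.head(h))             # control prediction at step t
        return torch.stack(outputs, dim=1)           # (B, T, 1)


# Example: a batch of 2 sequences, 5 frames each.
model = SpatioTemporalAttentionDriver()
steering = model(torch.randn(2, 5, 3, 64, 64))
print(steering.shape)  # torch.Size([2, 5, 1])
```

The attention weights alpha form a spatial map per frame, which is also what would be visualized to check which image regions the model attends to when making a driving decision.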