A gaze model improves autonomous driving
Congcong Liu, Y. Chen, L. Tai, Haoyang Ye, Ming Liu, Bertram E. Shi
Proceedings of the 11th ACM Symposium on Eye Tracking Research & Applications (ETRA), June 25, 2019
DOI: 10.1145/3314111.3319846 (https://doi.org/10.1145/3314111.3319846)
Citations: 23
Abstract
End-to-end behavioral cloning trained on human demonstrations is now a popular approach for vision-based autonomous driving. A deep neural network maps drive-view images directly to steering commands. However, the images contain much task-irrelevant data. Humans attend to behaviorally relevant information using saccades that direct gaze towards important areas. We demonstrate that behavioral cloning also benefits from active control of gaze. We trained a conditional generative adversarial network (GAN) that accurately predicts human gaze maps while driving in both familiar and unseen environments. We incorporated the predicted gaze maps into end-to-end networks for two behaviors: following and overtaking. Incorporating gaze information significantly improves generalization to unseen environments. We hypothesize that incorporating gaze information enables the network to focus on task-critical objects, which vary little between environments, and ignore irrelevant elements in the background, which vary greatly.
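
For readers unfamiliar with the second stage described above, the sketch below illustrates one plausible way a predicted gaze map could be fed into an end-to-end steering network: the gaze map is stacked with the RGB drive-view frame as a fourth input channel, and a small convolutional network regresses a steering command. This is not the authors' implementation; the layer sizes, input resolution, and channel-concatenation scheme are assumptions for illustration only, and the paper's actual architecture may differ.

import torch
import torch.nn as nn


class GazeModulatedDrivingNet(nn.Module):
    """Hypothetical behavioral-cloning network conditioned on a gaze map."""

    def __init__(self):
        super().__init__()
        # 4 input channels: RGB drive-view image + 1-channel predicted gaze map.
        self.features = nn.Sequential(
            nn.Conv2d(4, 24, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(24, 36, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(36, 48, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(48, 64, kernel_size=3, stride=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64, 50), nn.ReLU(),
            nn.Linear(50, 1),  # single scalar: predicted steering command
        )

    def forward(self, image, gaze_map):
        # image:    (B, 3, H, W) drive-view frame
        # gaze_map: (B, 1, H, W) predicted gaze density, values in [0, 1]
        x = torch.cat([image, gaze_map], dim=1)
        return self.head(self.features(x))


if __name__ == "__main__":
    net = GazeModulatedDrivingNet()
    img = torch.rand(2, 3, 120, 160)    # dummy batch of drive-view images
    gaze = torch.rand(2, 1, 120, 160)   # dummy predicted gaze maps
    print(net(img, gaze).shape)         # torch.Size([2, 1])

In the paper, the gaze maps themselves come from a separately trained conditional GAN that predicts human gaze from the driving image; the sketch above only covers how such a map might then condition the steering network.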