Yuki Mori, Hiroshi Fukui, Tsubasa Hirakawa, Jo Nishiyama, Takayoshi Yamashita, H. Fujiyoshi
2019 IEEE Intelligent Transportation Systems Conference (ITSC), pp. 4317-4322. Published 2019-10-01. DOI: https://doi.org/10.1109/ITSC.2019.8917187
Attention Neural Baby Talk: Captioning of Risk Factors while Driving
Driving has various risk factors, including the possibility of traffic accidents involving pedestrians and/or oncoming vehicles. A driver assistance system that can prevent traffic accidents must be able to get the driver's attention. A practical solution for attention attraction should involve caption generation from in-vehicle images. Although a number of approaches for caption generation with deep neural networks have been proposed, they are inadequate for describing the specific risk factors that arise while driving. The reason is that conventional captioning methods focus not on these factors but on the image as a whole. To tackle this problem, we first created a dataset for attention attraction that considers risk factors during driving. Furthermore, we propose an image captioning method for the assistance system. Our method is based on Neural Baby Talk and introduces an attention mask that focuses on risk factors in an image. The mask enables our model to generate a caption for each factor. Experimental results on our dataset show that our method can generate captions suitable for attracting the driver's attention.
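The abstract does not say how the attention mask is built or where it enters the network, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes each risk factor is given as a bounding box and that the mask is a binary map multiplied element-wise into a 2-D feature map. The names `box_mask`, `apply_mask`, and `masked_inputs` are hypothetical.

```python
from typing import List, Tuple

# Assumption: a risk factor is located by a box (x_min, y_min, x_max, y_max),
# with the max coordinates exclusive.
Box = Tuple[int, int, int, int]
Grid = List[List[float]]


def box_mask(height: int, width: int, box: Box) -> Grid:
    """Return a height x width binary mask that is 1.0 inside the box."""
    x0, y0, x1, y1 = box
    return [
        [1.0 if (x0 <= x < x1 and y0 <= y < y1) else 0.0 for x in range(width)]
        for y in range(height)
    ]


def apply_mask(features: Grid, mask: Grid) -> Grid:
    """Element-wise product: keep features inside the mask, zero the rest."""
    return [
        [f * m for f, m in zip(f_row, m_row)]
        for f_row, m_row in zip(features, mask)
    ]


def masked_inputs(features: Grid, boxes: List[Box]) -> List[Grid]:
    """Build one masked feature map per risk-factor box.

    Each masked map could then be passed to a captioning model so that
    every risk factor receives its own caption (hypothetical pipeline).
    """
    height, width = len(features), len(features[0])
    return [apply_mask(features, box_mask(height, width, b)) for b in boxes]


if __name__ == "__main__":
    # Toy 4x4 "feature map" and two made-up risk-factor boxes.
    feats = [[float(4 * r + c + 1) for c in range(4)] for r in range(4)]
    for masked in masked_inputs(feats, [(0, 0, 2, 2), (2, 2, 4, 4)]):
        print(masked)
```

Producing one masked input per box mirrors the abstract's statement that the mask lets the model caption each factor separately. A real system might instead use soft masks or apply them inside the attention layers.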