Attention Neural Baby Talk: Captioning of Risk Factors While Driving
- Y. Mori, H. Fukui, T. Hirakawa, J. Nishiyama, T. Yamashita, H. Fujiyoshi
- The 22nd IEEE International Conference on Intelligent Transportation Systems (ITSC), 2019
Driving involves various risk factors, including possible traffic accidents involving pedestrians and/or oncoming vehicles. To prevent such accidents, a driver assistance system must be able to draw the driver’s attention to these risks. A practical way to do so is to generate captions from in-vehicle images. Although a number of deep-neural-network approaches to caption generation have been proposed, they are inadequate for describing specific risk factors while driving, because conventional captioning methods focus on the image as a whole rather than on these factors. To tackle this problem, we first created a dataset for attracting driver attention that takes risk factors during driving into account. We then propose an image captioning method for the assistance system. Our method is based on Neural Baby Talk and introduces an attention mask that focuses on risk factors in an image. The mask enables our model to generate a caption for each factor. Experimental results on our dataset show that our method can generate captions well suited to attracting the driver’s attention.