Image Captioning in Near Future from Vehicle Camera Images and Motion Information
- Yuki Mori, Tsubasa Hirakawa, Takayoshi Yamashita, Hironobu Fujiyoshi
- IEEE Intelligent Vehicles Symposium, 2021
Image captioning is the task of generating a sentence that describes an input image. In autonomous driving, image captioning is expected to provide linguistic explanations of the vehicle's control decisions, which can reduce the psychological burden on passengers and help prevent accidents. Current image-captioning methods, however, generate a caption only for the input image itself and do not generate captions for events in the near future. Captioning events that are about to happen is important for preventing accidents and alerting passengers. We therefore define a new task: generating an explanatory sentence for near-future events from images observed from the past up to the present. For this task, we propose a near-future image-captioning method suited to in-vehicle camera images. Experiments on the Berkeley Deep Drive eXplanation Dataset show that the proposed method can generate appropriate captions for near-future events.