Describing Image with Attention based GRU

dc.contributor.authorMallick, V.R.
dc.contributor.authorNaik, D.
dc.date.accessioned2026-02-06T06:35:56Z
dc.date.issued2021
dc.description.abstractGenerating descriptions for images is a popular research topic today. In the encoder-decoder framework, a CNN works as the encoder, encoding the image and passing the result to an RNN decoder as input, which generates the image description as natural-language sentences. LSTM is widely used as the RNN decoder. Attention mechanisms have also played an important role in this field by enhancing object detection. Inspired by these recent advances in computer vision, we used a GRU in place of an LSTM as the decoder of our image captioning model. We incorporated an attention mechanism with the GRU decoder to enhance the precision of the generated captions. A GRU has fewer tensor operations than an LSTM, and hence trains faster. © 2021 IEEE.
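The pipeline the abstract describes (CNN image features scored by Bahdanau-style additive attention, with the resulting context vector fed into a GRU decoder) can be sketched for a single decoding step. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: all dimensions, the random weight initialization, and the choice to concatenate the context vector with the previous word embedding are assumptions made for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (assumed): InceptionV3-style features reduced to a
# grid of 64 region vectors of size 256; GRU hidden size 512.
num_regions, feat_dim = 64, 256   # encoder output: image regions x features
hidden_dim = 512                  # GRU decoder hidden size
attn_dim = 256                    # attention scoring layer size

features = rng.standard_normal((num_regions, feat_dim))  # CNN encoder output
h_prev = rng.standard_normal(hidden_dim)                 # previous GRU state
x_t = rng.standard_normal(feat_dim)                      # embedded previous word

def softmax(z):
    z = z - z.max()           # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

# --- Bahdanau (additive) attention: score_i = v^T tanh(W1 f_i + W2 h_prev) ---
W1 = rng.standard_normal((feat_dim, attn_dim)) * 0.01
W2 = rng.standard_normal((hidden_dim, attn_dim)) * 0.01
v = rng.standard_normal(attn_dim) * 0.01

scores = np.tanh(features @ W1 + h_prev @ W2) @ v
alpha = softmax(scores)       # attention weights over image regions, sum to 1
context = alpha @ features    # context vector: attention-weighted feature sum

# --- One GRU step on [word embedding ; context] (concatenation assumed) ---
inp = np.concatenate([x_t, context])
in_dim = inp.size

def init(shape):
    return rng.standard_normal(shape) * 0.01

Wz, Uz = init((in_dim, hidden_dim)), init((hidden_dim, hidden_dim))
Wr, Ur = init((in_dim, hidden_dim)), init((hidden_dim, hidden_dim))
Wh, Uh = init((in_dim, hidden_dim)), init((hidden_dim, hidden_dim))

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

z = sigmoid(inp @ Wz + h_prev @ Uz)              # update gate
r = sigmoid(inp @ Wr + h_prev @ Ur)              # reset gate
h_tilde = np.tanh(inp @ Wh + (r * h_prev) @ Uh)  # candidate state
h_t = (1 - z) * h_prev + z * h_tilde             # new hidden state

print(alpha.shape, h_t.shape)
```

Note the gate count: the GRU uses two gates (update, reset) against the LSTM's three gates plus a separate cell state, which is the source of the abstract's claim that the GRU needs fewer tensor operations per step. A full captioning model would project `h_t` through a vocabulary-sized softmax to predict the next word and loop until an end token.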
dc.identifier.citation2021 6th International Conference for Convergence in Technology, I2CT 2021, 2021
dc.identifier.urihttps://doi.org/10.1109/I2CT51068.2021.9418171
dc.identifier.urihttps://idr.nitk.ac.in/handle/123456789/30154
dc.publisherInstitute of Electrical and Electronics Engineers Inc.
dc.subjectAttention
dc.subjectBahdanau attention
dc.subjectConvolutional Neural Network [CNN]
dc.subjectGated Recurrent Unit [GRU]
dc.subjectInceptionV3
dc.subjectLong Short Term Memory [LSTM]
dc.titleDescribing Image with Attention based GRU