Describing Image with Attention based GRU

Authors: Mallick, V.R.; Naik, D.
Year: 2021
Date: 2026-02-06
Published in: 2021 6th International Conference for Convergence in Technology, I2CT 2021
DOI: https://doi.org/10.1109/I2CT51068.2021.9418171
Handle: https://idr.nitk.ac.in/handle/123456789/30154

Abstract: Generating descriptions for images is a popular research topic. In the encoder-decoder model, a CNN acts as the encoder: it encodes the image and passes the encoding to an RNN decoder, which generates the image description as a natural-language sentence. LSTM is widely used as the RNN decoder. The attention mechanism has also played an important role in this field by enhancing object detection. Inspired by these recent advances in computer vision, we used a GRU in place of an LSTM as the decoder of our image captioning model. We incorporated an attention mechanism into the GRU decoder to improve the precision of the generated captions. A GRU has fewer tensor operations than an LSTM, so it should be faster to train. © 2021 IEEE.

Keywords: Attention; Bahdanau attention; Convolutional Neural Network [CNN]; Gated Recurrent Unit [GRU]; InceptionV3; Long Short Term Memory [LSTM]
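
The decoding step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the record gives no code, layer sizes, or training details, so every function name, dimension, and initialisation below is a hypothetical choice made for illustration. The sketch assumes the CNN encoder has already produced a grid of region features, applies Bahdanau (additive) attention over them, and feeds the resulting context vector together with a word embedding into one textbook GRU step. The last helper counts weights in the textbook formulation to show why a GRU (three weight blocks) is lighter than an LSTM (four); framework implementations may differ slightly in their bias terms.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def bahdanau_attention(features, hidden, W1, W2, v):
    """Additive attention: score_i = v . tanh(W1^T f_i + W2^T h).

    features: (L, D) region features from the CNN encoder
    hidden:   (H,)   previous decoder state
    Returns the context vector (D,) and the attention weights (L,).
    """
    scores = np.tanh(features @ W1 + hidden @ W2) @ v  # (L,)
    weights = np.exp(scores - scores.max())            # stable softmax
    weights /= weights.sum()
    context = weights @ features                       # weighted sum of regions
    return context, weights


def init_gru_params(input_dim, hidden_dim, rng):
    """Hypothetical small random init for the three GRU weight blocks."""
    p = {}
    for gate in ("z", "r", "h"):
        p["W" + gate] = rng.normal(0.0, 0.1, (input_dim, hidden_dim))
        p["U" + gate] = rng.normal(0.0, 0.1, (hidden_dim, hidden_dim))
        p["b" + gate] = np.zeros(hidden_dim)
    return p


def gru_step(x, h, p):
    """One textbook GRU update (update gate z, reset gate r, candidate state)."""
    z = sigmoid(x @ p["Wz"] + h @ p["Uz"] + p["bz"])
    r = sigmoid(x @ p["Wr"] + h @ p["Ur"] + p["br"])
    h_cand = np.tanh(x @ p["Wh"] + (r * h) @ p["Uh"] + p["bh"])
    # Convention assumed here: z close to 1 means "take the candidate state".
    return (1.0 - z) * h + z * h_cand


def decoder_step(features, word_emb, h, attn, gru):
    """Attend over image regions, then advance the GRU by one token."""
    context, weights = bahdanau_attention(features, h, *attn)
    x = np.concatenate([context, word_emb])
    return gru_step(x, h, gru), weights


def gate_param_count(input_dim, hidden_dim, n_blocks):
    """Weights in a gated RNN cell with n_blocks (W, U, b) triples."""
    return n_blocks * (input_dim * hidden_dim + hidden_dim * hidden_dim + hidden_dim)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Illustrative sizes only: regions, feature dim, embedding dim,
    # hidden dim, attention dim.
    L, D, E, H, A = 64, 16, 8, 12, 10
    features = rng.normal(size=(L, D))
    attn = (
        rng.normal(0.0, 0.1, (D, A)),
        rng.normal(0.0, 0.1, (H, A)),
        rng.normal(0.0, 0.1, A),
    )
    gru = init_gru_params(D + E, H, rng)
    h, weights = decoder_step(features, rng.normal(size=E), np.zeros(H), attn, gru)
    print("hidden:", h.shape, "attention:", weights.shape)
    print("GRU params:", gate_param_count(D + E, H, 3))
    print("LSTM params:", gate_param_count(D + E, H, 4))
```

In a full captioning model this step would be repeated once per generated word, with a projection from the hidden state to vocabulary logits on top; that part is omitted here because the record does not describe it.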