CNN-GRU: Transforming image into sentence using GRU and attention mechanism

Saini, G.; Patil, N.

dc.contributor.author	Saini, G.
dc.contributor.author	Patil, N.
dc.date.accessioned	2026-02-06T06:36:14Z
dc.date.issued	2021
dc.description.abstract	Recent advances in deep neural networks have attracted great attention in both Natural Language Processing (NLP) and Computer Vision (CV). They provide an efficient way of understanding semantic and syntactic structure, which makes it possible to handle complex tasks such as automatic image captioning. Image captioning methods are mainly based on the encoder-decoder approach. In the present work, we developed a CNN-GRU model using a Convolutional Neural Network (CNN), a Gated Recurrent Unit (GRU), and an attention mechanism. Here VGG16 is used as the encoder, and a GRU with an attention mechanism is used as the decoder. Our model has shown significant improvement compared to other state-of-the-art encoder-decoder models on the well-known MSCOCO data set. Further, the time taken to train and test our model is two-thirds of that taken by other similar models such as CNN-CNN and CNN-RNN. © Grenze Scientific Society, 2021.
dc.identifier.citation	12th International Conference on Advances in Computing, Control, and Telecommunication Technologies, ACT 2021, 2021, Vol. 2021-August, p. 487-493
dc.identifier.uri	https://doi.org/
dc.identifier.uri	https://idr.nitk.ac.in/handle/123456789/30318
dc.publisher	Grenze Scientific Society
dc.subject	Computer vision
dc.subject	Image captioning
dc.subject	Machine translation
dc.subject	Natural language processing
dc.subject	Video captioning
dc.title	CNN-GRU: Transforming image into sentence using GRU and attention mechanism

Collections

Conference Papers
