Effect of Batch Normalization and Stacked LSTMs on Video Captioning

Date

2021

Journal Title

Journal ISSN

Volume Title

Publisher

Institute of Electrical and Electronics Engineers Inc.

Abstract

Integrating visual content with natural language to generate image or video descriptions has been a challenging task for many years. Recent research in image captioning using Long Short-Term Memory (LSTM) networks has motivated their application to video captioning, where a video is converted into an array of frames (images), and this array, together with the video's captions, is used to train the LSTM network to associate the video with sentences. However, very little is known about how fine-tuning techniques such as batch normalization or stacked LSTM models affect performance in video captioning. In this project, we compare the performance of the base model described in [1] against variants with batch normalization and stacked LSTMs, using the base model as our reference. © 2021 IEEE.
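To illustrate the fine-tuning technique the abstract names, the sketch below shows batch normalization applied to per-frame CNN features before they are fed to an LSTM decoder. This is a minimal NumPy illustration, not the paper's implementation; the feature shapes and the `batch_norm` helper are assumptions for demonstration.

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    # Normalize each feature across the batch dimension, then scale/shift.
    # gamma and beta would be learned parameters in a trained model.
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

# Hypothetical CNN features for a batch of 4 video frames, 8 features each.
rng = np.random.default_rng(0)
feats = rng.normal(loc=5.0, scale=3.0, size=(4, 8))

normed = batch_norm(feats)
# After normalization each feature has roughly zero mean and unit variance,
# which stabilizes the inputs passed on to the (stacked) LSTM decoder.
print(normed.mean(axis=0))
print(normed.std(axis=0))
```

"Stacked" LSTMs then simply feed the hidden-state sequence of one LSTM layer as the input sequence of the next, deepening the decoder without changing the per-step interface.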

Keywords

Attention, Bidirectional LSTM, Video Captioning

Citation

Proceedings - 5th International Conference on Computing Methodologies and Communication, ICCMC 2021, 2021, pp. 820-825
