A robust approach to open vocabulary image retrieval with deep convolutional neural networks and transfer learning

Date

2018

Authors

Padmakumar, V.
Ranga, R.
Elluru, S.
Sowmya, Kamath S.

Abstract

Enabling computer systems to respond to conversational human language is a challenging problem with wide-ranging applications in robotics and human-computer interaction. Specifically, in image searches, humans tend to describe objects in fine-grained detail such as color or company, for which conventional retrieval algorithms have shown poor performance. In this paper, a novel approach for open vocabulary image retrieval is presented, capable of selecting the correct candidate image from among a set of distractors given a query in natural language form. Our methodology focuses on generating a robust set of image-text projections capable of accurately representing any image, with the objective of achieving high recall. To this end, an ensemble of classifiers is trained to generate category-based projections: on ImageNet for representing high-resolution objects, on CIFAR-100 for lower-resolution images of objects, and on Caltech-256 for challenging views of everyday objects. In addition to category-based projections, we also make use of an image captioning model trained on MS COCO and Google Image Search (GISS) to capture additional semantic/latent information about the candidate images. To facilitate image retrieval, the natural language query and projection results are converted to a common vector representation using word embeddings, with which query-image similarity is computed. When benchmarked on the RefCOCO dataset, the proposed model achieved an accuracy of 68.8% while retrieving semantically meaningful candidate images. © 2018 Pacific Neighborhood Consortium (PNC).
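
The final retrieval step described in the abstract (mapping the query and each image's text projections into a shared word-embedding space and ranking candidates by similarity) can be illustrated with a minimal sketch. Everything concrete below is an assumption made for illustration and is not taken from the paper: the hand-made three-dimensional vectors in TOY_EMBEDDINGS, averaging word vectors to embed a phrase, pooling an image's labels and captions into one string, cosine similarity as the score, and the function names embed, cosine and retrieve. A real system would load pretrained word embeddings and take its projections from the classifier ensemble and the captioning model.

```python
import math

# Hypothetical 3-dimensional word vectors, hand-made for illustration only.
# A real system would load pretrained word embeddings instead.
TOY_EMBEDDINGS = {
    "red":   [1.0, 0.0, 0.0],
    "blue":  [0.0, 1.0, 0.0],
    "car":   [0.0, 0.0, 1.0],
    "truck": [0.1, 0.0, 0.9],
    "dog":   [0.0, 0.5, 0.1],
}
DIM = 3


def embed(text):
    """Average the vectors of the known words in `text` (assumed pooling rule)."""
    vectors = [TOY_EMBEDDINGS[w] for w in text.lower().split() if w in TOY_EMBEDDINGS]
    if not vectors:
        return [0.0] * DIM  # no known words: fall back to the zero vector
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, candidates):
    """Score every candidate image against the query and return the best one.

    `candidates` maps an image id to its list of text projections
    (category labels and captions), which are pooled into a single string.
    """
    query_vec = embed(query)
    scores = {
        image_id: cosine(query_vec, embed(" ".join(projections)))
        for image_id, projections in candidates.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    candidates = {
        "img_a": ["red car"],
        "img_b": ["blue dog"],
        "img_c": ["truck"],
    }
    best, scores = retrieve("red truck", candidates)
    print(best, scores)
```

With these toy vectors, the query "red truck" ranks img_a highest because its pooled projection shares both the color and the vehicle direction with the query. Averaging is only one possible pooling choice; it ignores word order, which is acceptable for short label-like projections but may lose information for longer captions.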

Citation

Proceedings of the 2018 Pacific Neighborhood Consortium Annual Conference and Joint Meetings: Human Rights in Cyberspace, PNC 2018, 2018, pp. 106-112
