Browsing by Author "Jawahar, C.V."

A multi-space approach to zero-shot object detection
(Institute of Electrical and Electronics Engineers Inc., 2020) Gupta, D.; Anantharaman, A.; Mamgain, N.; Kamath S., S.; Balasubramanian, V.N.; Jawahar, C.V.
Object detection is at the forefront of higher-level vision tasks such as scene understanding and contextual reasoning. Therefore, solving object detection for a large number of visual categories is paramount. Zero-Shot Object Detection (ZSD), where training data is not available for some of the target classes, provides semantic scalability to object detection and reduces dependence on large amounts of annotation, thus enabling a large number of applications in real-life scenarios. In this paper, we propose a novel multi-space approach to ZSD in which we combine predictions obtained in two different search spaces. We learn a projection of the visual features of proposals into the semantic embedding space, and a projection of class labels from the semantic embedding space into the visual space. We predict similarity scores in the individual spaces and combine them. We present promising results on two datasets, PASCAL VOC and MS COCO. We further discuss the problem of hubness and show that our approach alleviates hubness, with performance superior to previously proposed methods. © 2020 IEEE.
Multi-label annotation of music
(2015) Ahsan, H.; Kumar, V.; Jawahar, C.V.
Automatic annotation of an audio or music piece with multiple labels helps in understanding the composition of the music. Such meta-level information can be very useful in applications such as music transcription, retrieval, organization and personalization. In this work, we formulate the annotation problem as multi-label classification, which differs considerably from the more common single-label (binary or multi-class) classification. We employ both nearest-neighbour and max-margin (SVM) formulations for automatic annotation. We consider K-NN and SVM adapted for multi-label classification using a one-vs-rest strategy, as well as direct multi-label formulations using ML-KNN and M3L. In music, the signatures of the labels (e.g. instruments and vocal signatures) are often fused in the features. We therefore propose a simple feature augmentation technique based on non-negative matrix factorization (NMF), with the intuition of decomposing a music piece into its constituent components. We conduct experiments on two datasets, the Indian classical instruments dataset and the Emotions dataset [1], and validate the methods. © 2015 IEEE.
Multi-label annotation of music
(Institute of Electrical and Electronics Engineers Inc., 2015) Ahsan, H.; Kumar, V.; Jawahar, C.V.
Automatic annotation of an audio or music piece with multiple labels helps in understanding the composition of the music. Such meta-level information can be very useful in applications such as music transcription, retrieval, organization and personalization. In this work, we formulate the annotation problem as multi-label classification, which differs considerably from the more common single-label (binary or multi-class) classification. We employ both nearest-neighbour and max-margin (SVM) formulations for automatic annotation. We consider K-NN and SVM adapted for multi-label classification using a one-vs-rest strategy, as well as direct multi-label formulations using ML-KNN and M3L. In music, the signatures of the labels (e.g. instruments and vocal signatures) are often fused in the features. We therefore propose a simple feature augmentation technique based on non-negative matrix factorization (NMF), with the intuition of decomposing a music piece into its constituent components. We conduct experiments on two datasets, the Indian classical instruments dataset and the Emotions dataset [1], and validate the methods. © 2015 IEEE.