Multi-branch Deep Neural Model for Natural Language-Based Vehicle Retrieval

dc.contributor.author: Shankaranarayan, N.
dc.contributor.author: Kamath S․, S.
dc.date.accessioned: 2026-02-06T06:34:57Z
dc.date.issued: 2023
dc.description.abstract: Natural language interfaces (NLIs) have become widely popular in recent years. Using natural language (NL) descriptions to identify vehicles in city-scale smart traffic systems is an emerging problem that has received significant research interest. NL-based vehicle identification/retrieval can significantly improve the usability and user-friendliness of existing systems. This paper explores the problem of NL-based vehicle retrieval: retrieving/identifying a unique vehicle from a single-view video, given the vehicle’s natural language description. Natural language descriptions are leveraged to identify a specific target vehicle based on its visual features and on environmental features such as trajectory and neighbours. We propose a multi-branch model that learns the target vehicle’s visual features, environmental features, and direction. The model concatenates these into a single feature vector and compares it with the feature vector of the given natural language description to calculate a similarity score, thus identifying the vehicle of interest. The CityFlow-NL dataset was used for training and validation, and performance was measured using Mean Reciprocal Rank (MRR). The proposed model achieved a standardised MRR score of 0.15, which is on par with state-of-the-art models. © 2023, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
dc.identifier.citation: Lecture Notes in Networks and Systems, 2023, Vol. 586 LNNS, pp. 603-613
dc.identifier.issn: 2367-3370
dc.identifier.uri: https://doi.org/10.1007/978-981-19-7867-8_48
dc.identifier.uri: https://idr.nitk.ac.in/handle/123456789/29549
dc.publisher: Springer Science and Business Media Deutschland GmbH
dc.subject: Natural language processing
dc.subject: Vehicle retrieval/identification
dc.subject: Vision-based transformers
dc.title: Multi-branch Deep Neural Model for Natural Language-Based Vehicle Retrieval
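
The abstract reports performance as Mean Reciprocal Rank (MRR). As a rough illustration of how that metric is commonly computed for a retrieval task, the sketch below averages the reciprocal of the rank at which the correct vehicle appears for each query. The function name, the vehicle IDs, and the toy rankings are hypothetical and are not taken from the paper; the paper's own evaluation code and its "standardised" variant of the score may differ.

```python
def mean_reciprocal_rank(ranked_lists, targets):
    """Average of 1/rank of the correct item over all queries.

    ranked_lists: one list of candidate IDs per query, best match first.
    targets: the correct ID for each query (same order as ranked_lists).
    A query whose target is missing from its ranking contributes 0.
    """
    if not targets or len(ranked_lists) != len(targets):
        raise ValueError("need one non-empty ranking per target")
    total = 0.0
    for ranked, target in zip(ranked_lists, targets):
        if target in ranked:
            # list.index is 0-based; ranks are 1-based
            total += 1.0 / (ranked.index(target) + 1)
    return total / len(targets)


# Toy data (hypothetical): three NL queries, three candidate vehicles each.
rankings = [
    ["v3", "v1", "v2"],  # correct vehicle v1 is at rank 2
    ["v2", "v3", "v1"],  # correct vehicle v2 is at rank 1
    ["v1", "v2", "v3"],  # correct vehicle v3 is at rank 3
]
targets = ["v1", "v2", "v3"]
mrr = mean_reciprocal_rank(rankings, targets)
print(f"MRR = {mrr:.4f}")
```

In this toy case the reciprocal ranks are 1/2, 1, and 1/3, so the mean is 11/18. A score near 1 means the correct vehicle is usually ranked first; lower scores mean it tends to appear further down the list.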
