Multi-branch Deep Neural Model for Natural Language-Based Vehicle Retrieval

Shankaranarayan, N.; Kamath S., S.
2026-02-06
Lecture Notes in Networks and Systems, 2023, Vol. 586 LNNS, p. 603-613
ISSN 2367-3370
https://doi.org/10.1007/978-981-19-7867-8_48
https://idr.nitk.ac.in/handle/123456789/29549

Keywords: Natural language processing; Vehicle retrieval/identification; Vision-based transformers

Natural language interfaces (NLIs) have become highly popular in recent times. Using natural language (NL) descriptions to identify vehicles in city-scale smart traffic systems is an emerging problem that has received significant research interest. NL-based vehicle identification/retrieval can significantly improve the usability and user-friendliness of existing systems. This paper explores NL-based vehicle retrieval: identifying a unique vehicle in a single-view video given the vehicle’s natural language description. The description is used to identify a specific target vehicle based on its visual features and on environmental features such as trajectory and neighbours. We propose a multi-branch model that learns the target vehicle’s visual features, environmental features, and direction. The model concatenates these into a single feature vector and compares it with the feature vector of the given natural language description to compute a similarity score, which identifies the vehicle of interest. The Cityflow-NL dataset was used for training and validation, and performance was measured using Mean Reciprocal Rank (MRR). The proposed model achieved a standardised MRR score of 0.15, which is on par with state-of-the-art models.

© 2023, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
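The abstract describes ranking candidate vehicles by a similarity score and evaluating with MRR, so a small illustration of that evaluation may help. The sketch below is not the authors' implementation. It assumes cosine similarity as the score (the abstract only says "similarity score"), and the function names, track ids, and toy two-dimensional vectors are all hypothetical, chosen purely for illustration.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(query_vec: Sequence[float],
                    candidate_vecs: Dict[str, Sequence[float]]) -> List[str]:
    """Return candidate ids ordered from most to least similar to the query."""
    return sorted(
        candidate_vecs,
        key=lambda cid: cosine_similarity(query_vec, candidate_vecs[cid]),
        reverse=True,
    )


def mean_reciprocal_rank(rankings: List[List[str]], targets: List[str]) -> float:
    """Mean of 1/rank of the correct item per query (a missing target contributes 0)."""
    if not rankings:
        return 0.0
    total = 0.0
    for ranking, target in zip(rankings, targets):
        if target in ranking:
            total += 1.0 / (ranking.index(target) + 1)
    return total / len(rankings)


# Hypothetical vehicle-track embeddings (stand-ins for concatenated branch features).
tracks = {"car_a": [1.0, 0.0], "car_b": [0.0, 1.0], "car_c": [1.0, 1.0]}

# Hypothetical description embeddings and the track each one actually refers to.
queries = [[0.9, 0.1], [0.2, 0.8]]
targets = ["car_a", "car_c"]

rankings = [rank_candidates(q, tracks) for q in queries]
mrr = mean_reciprocal_rank(rankings, targets)
print(rankings)
print(mrr)
```

In this toy setup the first description ranks its target first and the second ranks its target second, so the score is the mean of 1 and 1/2. In general an MRR of 1.0 would mean every description ranked its vehicle first.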