Title: Intrinsic evaluation for English–Tamil bilingual word embeddings
Authors: Sanjanasri, J.P.
Menon, V.K.
Rajendran, S.
Soman, K.P.
Anand Kumar, M.
Issue Date: 2020
Citation: Advances in Intelligent Systems and Computing, 2020, Vol. 910, pp. 39-51
Abstract: Despite the growth of bilingual word embeddings, no work so far has directly evaluated them for the English–Tamil language pair. In this paper, we present a dataset and an evaluation paradigm for the English–Tamil bilingual word vector model. The dataset contains words that cover a range of concepts occurring in natural language. It is scored on similarity rather than association or relatedness; hence, word pairs that are associated but not literally similar receive a low rating. The measures are quantified further to ensure consistency in the dataset, mimicking the cognitive phenomena. Consequently, the dataset can be used by non-native speakers with minimal effort. We also present some inferences and insights into the semantics captured by word vectors and human cognition. © Springer Nature Singapore Pte Ltd. 2020.
Appears in Collections: 2. Conference Papers

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.