Conference Papers

Permanent URI for this collection: https://idr.nitk.ac.in/handle/123456789/28506

  • Item
    Overview of the track on HASOC-offensive Language Identification-DravidianCodeMix
    (CEUR-WS, 2020) Chakravarthi, B.R.; Anand Kumar, M.; Mccrae, J.P.; Premjith, B.; Padannayil, K.P.; Mandl, T.
    We present the results and main findings of the HASOC-Offensive Language Identification track on code-mixed Dravidian languages. The track featured two tasks. Task 1 is about offensive language identification in Malayalam, where the comments were written in both native script and Latin script. Task 2 is about offensive language identification in Tamil and Malayalam, where the comments were written in Latin script (non-native script). For both tasks, given a comment, the participants should develop a system to classify the text as offensive or not-offensive. In total, 96 participants took part and 12 participants submitted papers. In this paper, we present the task, the data, and the results, and discuss the system submissions and the methods used by participants. © 2020 Copyright for this paper by its authors.
  • Item
    Findings of the Shared Task on Machine Translation in Dravidian languages
    (Association for Computational Linguistics (ACL), 2021) Chakravarthi, B.R.; Priyadharshini, R.; Banerjee, S.; Saldanha, R.; Mccrae, J.P.; Anand Kumar, M.; Krishnamurthy, P.; Johnson, M.
    This paper presents an overview of the shared task on machine translation of Dravidian languages. We presented the shared task results at the EACL 2021 workshop on Speech and Language Technologies for Dravidian Languages. This paper describes the datasets used, the methodology used for the evaluation of participants, and the experiments’ overall results. As a part of this shared task, we organized four sub-tasks corresponding to machine translation of the following language pairs: English to Tamil, English to Malayalam, English to Telugu, and Tamil to Telugu, which are available at https://competitions.codalab.org/competitions/27650. We provided the participants with training and development datasets to perform experiments, and the results were evaluated on unseen test data. In total, 46 research groups participated in the shared task and 7 experimental runs were submitted for evaluation. We used BLEU scores to assess the translations. © 2021 Association for Computational Linguistics
  • Item
    Findings of the Shared Task on Offensive Language Identification in Tamil, Malayalam, and Kannada
    (Association for Computational Linguistics (ACL), 2021) Chakravarthi, B.R.; Priyadharshini, R.; Jose, N.; Anand Kumar, M.; Mandl, T.; Kumaresan, P.K.; Ponnusamy, R.; LekshmiAmmal, R.L.; Mccrae, J.P.; Sherly, E.
    Detecting offensive language in social media in local languages is critical for moderating user-generated content. Thus, the field of offensive language identification for under-resourced languages like Tamil, Malayalam, and Kannada is of essential importance. As user-generated content is often code-mixed and not well studied for under-resourced languages, it is imperative to create resources and conduct benchmark studies to encourage research in under-resourced Dravidian languages. We created a shared task on offensive language detection in Dravidian languages. We summarize the datasets for this challenge, which are openly available at https://competitions.codalab.org/competitions/27654, and present an overview of the methods and the results of the competing systems. © 2021 Association for Computational Linguistics
  • Item
    Overview of the HASOC-DravidianCodeMix Shared Task on Offensive Language Detection in Tamil and Malayalam
    (CEUR-WS, 2021) Chakravarthi, B.R.; Kumaresan, P.K.; Sakuntharaj, R.; Anand Kumar, M.; Thavareesan, S.; Premjith, B.; Sreelakshmi, K.; Subalalitha, S.C.; Mccrae, J.P.; Mandl, T.
    In this paper, we present the results of the HASOC-Dravidian-CodeMix shared task held at FIRE 2021, a track on offensive language identification for Dravidian languages in code-mixed text. This paper details the task, its organisation, and the submitted systems. The identification of offensive language was viewed as a classification task. For this, 16 teams participated in identifying offensive language from Tamil-English code-mixed data, 11 teams for Malayalam-English code-mixed data, and 14 teams for Tamil data. The teams detected offensive language using various machine learning and deep learning classification models. This paper analyses those benchmark systems to find out how well they accommodate a code-mixed scenario in Dravidian languages, focusing on Tamil and Malayalam. © 2021 Copyright for this paper by its authors.
  • Item
    Findings of Shared Task on Offensive Language Identification in Tamil and Malayalam
    (Association for Computing Machinery, 2021) Kumaresan, P.K.; Premjith; Sakuntharaj, R.; Thavareesan, S.; Subalalitha, S.; Anand Kumar, M.; Chakravarthi, B.R.; Mccrae, J.P.
    In this paper, we present the results of the HASOC-Dravidian-CodeMix shared task held at FIRE 2021, a track on offensive language identification for Dravidian languages in code-mixed text. This paper details the task, its organisation, and the submitted systems. The identification of offensive language was viewed as a classification task. For this, 16 teams participated in identifying offensive language from Tamil-English code-mixed data, 11 teams for Malayalam-English code-mixed data, and 14 teams for Tamil data. The teams detected offensive language using various machine learning and deep learning classification models. This paper analyses those benchmark systems to find out how well they accommodate a code-mixed scenario in Dravidian languages, focusing on Tamil and Malayalam. © 2021 Owner/Author.
  • Item
    Overview of the Shared Task on Machine Translation in Dravidian Languages
    (Association for Computational Linguistics (ACL), 2022) Anand Kumar, A.M.; Hegde, A.; Banerjee, S.; Chakravarthi, B.R.; Priyadarshini, R.; Shashirekha, H.L.; Mccrae, J.P.
    This paper presents an outline of the shared task on translation of under-resourced Dravidian languages at the DravidianLangTech-2022 workshop, to be held jointly with ACL 2022. It describes the datasets used, the approach taken to analyse the submissions, and the results. The five sub-tasks organized as a part of the shared task cover the following translation pairs: Kannada to Tamil, Kannada to Telugu, Kannada to Sanskrit, Kannada to Malayalam, and Kannada to Tulu. Training, development, and test datasets were provided to all participants, and results were evaluated on the gold standard datasets. A total of 16 research groups participated in the shared task and a total of 12 submission runs were made for evaluation. The Bilingual Evaluation Understudy (BLEU) score was used to evaluate the translations. © 2022 Association for Computational Linguistics.