Multilingual Models for Sentiment and Abusive Language Detection for Dravidian Languages
Date
2023
Publisher
Incoma Ltd
Abstract
This work addresses abusive comment detection and sentiment analysis in code-mixed content, focusing on two Dravidian languages, Tulu and Tamil. TF-IDF-based Long Short-Term Memory (LSTM) models and Hierarchical Attention Networks (HAN) are employed. The research finds that traditional TF-IDF techniques prevail over Hierarchical Attention models in both sentiment analysis and the identification of abusive language across Tulu and Tamil. The Tulu sentiment analysis system performs notably well on the Positive and Neutral classes, whereas the Tamil sentiment analysis system performs comparatively worse. This discrepancy underscores the need for well-balanced datasets and further research to improve the accuracy of sentiment analysis, particularly for Tamil. In abusive language detection, the TF-IDF-LSTM models consistently outperform the Hierarchical Attention models. The mixed models are particularly strong at classifying categories such as "Homophobia" and "Xenophobia." This outcome highlights the value of incorporating both code-mixed and original-script data, and suggests new avenues for social media analysis research on the Dravidian languages. © 2023 LTEDI 2023 - 3rd Workshop on Language Technology for Equality, Diversity and Inclusion, associated with the 14th International Conference on Recent Advances in Natural Language Processing, RANLP 2023 - Proceedings. All rights reserved.
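The abstract does not describe how the TF-IDF features are computed, so the following is only a minimal sketch of one common TF-IDF formulation, not the authors' pipeline. It assumes whitespace tokenisation and a smoothed inverse document frequency; the function name tfidf_vectors and the sample comments are hypothetical and invented purely for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, vectors) for a list of comments.

    Assumption: plain whitespace tokenisation and lowercasing; the paper's
    actual preprocessing is not stated in the abstract.
    """
    tokenised = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenised for tok in toks})
    n_docs = len(tokenised)
    # Document frequency: number of comments that contain each token.
    df = Counter(tok for toks in tokenised for tok in set(toks))
    # Smoothed inverse document frequency (always positive and defined).
    idf = {tok: math.log((1 + n_docs) / (1 + df[tok])) + 1.0 for tok in vocab}
    vectors = []
    for toks in tokenised:
        counts = Counter(toks)
        total = len(toks) or 1  # guard against empty comments
        # Term frequency (relative count) multiplied by the token's IDF.
        vectors.append([counts[tok] / total * idf[tok] for tok in vocab])
    return vocab, vectors


# Hypothetical code-mixed comments, not taken from the paper's datasets.
comments = ["padam super da", "padam mokka", "super super movie"]
vocab, vectors = tfidf_vectors(comments)
```

Each comment becomes a fixed-length vector in which tokens that are rare across the collection receive larger weights. A vector of this kind could then be passed to a downstream classifier; how the paper connects TF-IDF features to the LSTM is not specified here.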
Citation
LTEDI 2023 - 3rd Workshop on Language Technology for Equality, Diversity and Inclusion, associated with the 14th International Conference on Recent Advances in Natural Language Processing, RANLP 2023 - Proceedings, 2023, p. 17-24
