Overlapping word removal is all you need: revisiting data imbalance in hope speech detection

dc.contributor.author: RamakrishnaIyer LekshmiAmmal, H.
dc.contributor.author: Ravikiran, M.
dc.contributor.author: Nisha, G.
dc.contributor.author: Balamuralidhar, N.
dc.contributor.author: Madhusoodanan, A.
dc.contributor.author: Anand Kumar, A.K.
dc.contributor.author: Chakravarthi, B.R.
dc.date.accessioned: 2026-02-04T12:25:43Z
dc.date.issued: 2024
dc.description.abstract: Hope speech detection is a new task for finding and highlighting positive comments or supporting content from user-generated social media comments. For this task, we have used a Shared Task multilingual dataset on Hope Speech Detection for Equality, Diversity, and Inclusion (HopeEDI) for three languages: English, code-switched Tamil and Malayalam. In this paper, we present deep learning techniques using context-aware string embeddings for word representations and Recurrent Neural Network (RNN) and pooled document embeddings for text representation. We have evaluated and compared the three models for each language with different approaches. Our proposed methodology works fine and achieved higher performance than baselines. The highest weighted average F-scores of 0.93, 0.58, and 0.84 are obtained on the task organisers' final evaluation test set. The proposed models are outperforming the baselines by 3%, 2% and 11% in absolute terms for English, Tamil and Malayalam respectively. © 2023 Informa UK Limited, trading as Taylor & Francis Group.
dc.identifier.citation: Journal of Experimental and Theoretical Artificial Intelligence, 2024, 36, 8, pp. 1837-1859
dc.identifier.issn: 0952813X
dc.identifier.uri: https://doi.org/10.1080/0952813X.2023.2166130
dc.identifier.uri: https://idr.nitk.ac.in/handle/123456789/21522
dc.publisher: Taylor and Francis Ltd.
dc.subject: Classification (of information)
dc.subject: Modeling languages
dc.subject: Recurrent neural networks
dc.subject: Speech recognition
dc.subject: Text processing
dc.subject: Data imbalance
dc.subject: Embeddings
dc.subject: Focal loss
dc.subject: Hope speech detection
dc.subject: Language model
dc.subject: Malayalams
dc.subject: Speech detection
dc.subject: Text classification
dc.subject: User-generated
dc.subject: Word removals
dc.title: Overlapping word removal is all you need: revisiting data imbalance in hope speech detection
