ScalarLab@TRAC2024: Exploring Machine Learning Techniques for Identifying Potential Offline Harm in Multilingual Commentaries

Anagha, H.C.; Krishna, S.M.; Jha, S.S.; Rao, V.T.; Anand Kumar, M.

dc.contributor.author	Anagha, H.C.
dc.contributor.author	Krishna, S.M.
dc.contributor.author	Jha, S.S.
dc.contributor.author	Rao, V.T.
dc.contributor.author	Anand Kumar, M.
dc.date.accessioned	2026-02-06T06:34:06Z
dc.date.issued	2024
dc.description.abstract	The objective of the shared task, Offline Harm Potential Identification (HarmPot-ID), is to build models that predict the offline harm potential of social media texts. "Harm potential" is defined as the ability of an online post or comment to incite offline physical harm such as murder, arson, riot, rape, etc. The first subtask was to predict the level of harm potential, and the second was to identify the group at which this harm was directed. This paper details our submissions for the shared task, which include a cascaded SVM model, an XGBoost model, and a TF-IDF weighted Word2Vec embedding-supported SVM model. Our system ranked 4th in the first subtask and 3rd in the second. Several other models that we explored are also described. © 2024 ELRA Language Resource Association.
dc.identifier.citation	TRAC 2024: 4th Workshop on Threat, Aggression and Cyberbullying at LREC-COLING 2024 - Workshop Proceedings, 2024, p. 32-36
dc.identifier.uri	https://doi.org/
dc.identifier.uri	https://idr.nitk.ac.in/handle/123456789/29055
dc.publisher	European Language Resources Association (ELRA)
dc.subject	Harm Potential
dc.subject	HarmPot
dc.subject	Offline harm
dc.subject	Offline Harm
dc.subject	Text classification
dc.subject	TF-IDF
dc.subject	weighted word embeddings
dc.title	ScalarLab@TRAC2024: Exploring Machine Learning Techniques for Identifying Potential Offline Harm in Multilingual Commentaries

Collections

Conference Papers
