KCe_Dalab@maponsms-Fire2018: Effective word and character-based features for multilingual author profiling

dc.contributor.author: Sharmila Devi, V.
dc.contributor.author: Subramanian, S.
dc.contributor.author: Ravikumar, G.
dc.contributor.author: Anand Kumar, M.
dc.date.accessioned: 2026-02-06T06:38:19Z
dc.date.issued: 2018
dc.description.abstract: This paper describes our work on identifying gender and age group in the Multilingual Author Profiling on SMS messages (MAPonSMS) shared task, conducted at the Forum for Information Retrieval Evaluation (FIRE 2018). To support the development of multilingual author profiling systems, the organizers released a training corpus of multilingual (Roman Urdu and English) SMS messages with their corresponding author profiles. For gender identification, a profile is either male or female. The author's age group falls into one of three categories: 15-19, 20-24, or 25-xx. We developed an author profiling system using word- and character-based Term Frequency-Inverse Document Frequency (TFIDF) features, classified with a Support Vector Machine (SVM) classifier. The proposed system achieved state-of-the-art performance in the multilingual author profiling on SMS task. The accuracy obtained is 65% for age-group identification and 87% for gender identification. When both are evaluated jointly, the accuracy is 57%. We also experimented with different system parameters and report the cross-validation accuracy. © 2018 CEUR-WS. All Rights Reserved.
dc.identifier.citation: CEUR Workshop Proceedings, 2018, Vol. 2266, p. 213-222
dc.identifier.issn: 16130073
dc.identifier.uri: https://doi.org/
dc.identifier.uri: https://idr.nitk.ac.in/handle/123456789/31606
dc.publisher: CEUR-WS ceurws@sunsite.informatik.rwth-aachen.de
dc.subject: Author profiling
dc.subject: Machine Learning
dc.subject: Multilingual SMS
dc.subject: Support Vector Machine
dc.subject: TFIDF
dc.subject: Word and Character-based features
dc.title: KCe_Dalab@maponsms-Fire2018: Effective word and character-based features for multilingual author profiling
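
The approach named in the abstract (word- and character-based TFIDF features classified with a linear SVM) can be illustrated with a minimal, standard-library-only sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy messages, the +1/-1 labels standing in for two profile classes, the word-unigram and character-trigram sizes, the smoothed IDF formula, and the Pegasos-style trainer. The record does not state the authors' actual configuration.

```python
import math
from collections import Counter

def word_ngrams(text, n=1):
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def char_ngrams(text, n=3):
    s = text.lower()
    return [s[i:i + n] for i in range(len(s) - n + 1)]

def extract_terms(text):
    # Prefixes keep the word and character feature spaces separate.
    words = ["w:" + t for t in word_ngrams(text, 1)]
    chars = ["c:" + t for t in char_ngrams(text, 3)]
    return words + chars

def fit_idf(docs):
    # Smoothed inverse document frequency (an assumed, common variant).
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(extract_terms(doc)))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}

def tfidf_vector(text, idf):
    # Sparse, L2-normalised TFIDF vector; unseen terms are dropped.
    tf = Counter(t for t in extract_terms(text) if t in idf)
    vec = {t: c * idf[t] for t, c in tf.items()}
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {t: v / norm for t, v in vec.items()} if norm else {}

def train_linear_svm(vectors, labels, epochs=50, lam=0.01):
    # Pegasos-style subgradient descent on the regularised hinge loss.
    # Labels must be +1 or -1.
    w = {}
    step = 0
    for _ in range(epochs):
        for x, y in zip(vectors, labels):
            step += 1
            eta = 1.0 / (lam * step)
            margin = y * sum(w.get(k, 0.0) * v for k, v in x.items())
            scale = 1.0 - 1.0 / step  # equals 1 - eta * lam
            for k in w:
                w[k] *= scale
            if margin < 1:
                for k, v in x.items():
                    w[k] = w.get(k, 0.0) + eta * y * v
    return w

def predict(text, idf, w):
    x = tfidf_vector(text, idf)
    score = sum(w.get(k, 0.0) * v for k, v in x.items())
    return 1 if score >= 0 else -1

# Hypothetical toy corpus; the labels are placeholders for two profile classes.
texts = ["please call", "call please asap", "good night", "night out"]
labels = [1, 1, -1, -1]

idf = fit_idf(texts)
weights = train_linear_svm([tfidf_vector(t, idf) for t in texts], labels)

print(predict("call asap", idf, weights))
print(predict("good night out", idf, weights))
```

In practice a library implementation of TFIDF vectorization and SVM training would normally replace the hand-written functions; the sketch only shows how word-level and character-level terms can share one feature vector before classification.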
