Title: Towards sentiment orientation data set enrichment
Authors: Sankaranarayanan, S.
Ingale, D.
Bhambhu, R.
Chandrasekaran, K.
Issue Date: 2016
Citation: ACM International Conference Proceeding Series, 2016, Vol. 04-05-March-2016
Abstract: Sentiment orientation data sets, referred to variously as affective word lists, opinion lexicons, sentiment lexicons, emotion lexicons or sentiment dictionaries, contain a list of words scored for the degree of positive and negative emotion they exhibit. Although these lists have been used extensively for the sentiment analysis of text data, they contain a limited number of words, which is often inadequate for data obtained from modern text sources dominated by the influence of social media, where new words are coined on a regular basis. In an effort to enrich these data sets with new words, we propose two methods. The first method involves the sentiment analysis of portmanteau words. We hypothesize that the sentiment score of a portmanteau word, which combines two (or more) words and their meanings into a single new word, can be determined as a function of the sentiment scores of its component words. Regression analysis has been used to determine this functional relationship, and several cases arising from it have been evaluated on a data set constructed from SentiWordNet. The second method is an in situ approach to sentiment discovery for unknown words that uses labeled tweets and words from the sentiment orientation data set as inputs to discover the sentiment score of the unknown word. To validate the resultant score, we have also used a novel validation-feedback mechanism akin to cross-validation. Both methods produce acceptable levels of accuracy, proving that they can be implemented in practice. © 2016 ACM.
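The first method in the abstract can be illustrated with a minimal sketch. The abstract does not state the form of the regression, so the code below assumes a linear model with two component words and a single signed score per word in [-1, 1]. The function names and the toy data are hypothetical: the values are synthetic placeholders generated from a known linear rule, not scores from SentiWordNet or results from the paper.

```python
# Sketch only: fit score(portmanteau) ~ w1 * score(a) + w2 * score(b) + bias
# by ordinary least squares. The linear form is an assumption, not the paper's
# stated model, and every number below is a synthetic placeholder.


def solve_linear_system(matrix, rhs):
    """Solve a small square system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    # Back substitution on the upper-triangular system.
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - known) / aug[row][row]
    return solution


def fit_component_regression(component_scores, portmanteau_scores):
    """Return [w1, w2, bias] that minimise the squared prediction error."""
    rows = [[a, b, 1.0] for a, b in component_scores]
    # Normal equations: (X^T X) w = X^T y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, portmanteau_scores)) for i in range(3)]
    return solve_linear_system(xtx, xty)


def predict(weights, score_a, score_b):
    """Predict a portmanteau score from its two component scores."""
    w1, w2, bias = weights
    return w1 * score_a + w2 * score_b + bias


# Synthetic component-score pairs (hypothetical, not taken from any lexicon).
TOY_COMPONENTS = [(0.5, 0.25), (-0.75, 0.5), (0.0, -0.5), (0.25, 0.75), (-0.25, -0.25)]
# Synthetic targets generated from the rule 0.6 * a + 0.4 * b.
TOY_TARGETS = [0.6 * a + 0.4 * b for a, b in TOY_COMPONENTS]

weights = fit_component_regression(TOY_COMPONENTS, TOY_TARGETS)
```

Because the toy targets are generated from a known rule, the fitted weights should recover that rule up to floating-point error, which only checks the fitting code. With real lexicon scores the fit would be noisy, and a non-linear or per-case model might be needed.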
Appears in Collections: 2. Conference Papers

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.