Conference Papers
Permanent URI for this collection: https://idr.nitk.ac.in/handle/123456789/28506
10 results
Search Results
Item Fuzzy sentiment analysis on microblogs for movie revenue prediction (IEEE Computer Society, 2013) Gupta, N.; Abhinav, K.R.; Annappa, B.
With the advent of microblogging in recent years, people voice their views about products, especially movies. Microblogs are rich sources of data that can be analyzed to derive useful knowledge such as the wider public opinion on a product, which can in turn be used to derive sales performance patterns. In this paper we propose a novel fuzzy approach for evaluating sentiments expressed in microblogs, which are incorporated in text mining methodologies to predict weekly movie revenues. © 2013 IEEE.

Item Performance analysis of Ensemble methods on Twitter sentiment analysis using NLP techniques (Institute of Electrical and Electronics Engineers Inc., 2015) Kanakaraj, M.; Guddeti, G.
Mining opinions and analyzing sentiments from social network data help in various tasks such as event prediction and analyzing the overall mood of the public on a particular social issue. This paper analyzes the mood of society on a particular news item from Twitter posts. The key idea of the paper is to increase the accuracy of classification by including Natural Language Processing (NLP) techniques, especially semantics and Word Sense Disambiguation. The mined text information is subjected to ensemble classification to analyze the sentiment. Ensemble classification involves combining the effect of various independent classifiers on a particular classification problem. Experiments conducted demonstrate that the ensemble classifier outperforms traditional machine learning classifiers by 3-5%. © 2015 IEEE.

Item NLP based sentiment analysis on Twitter data using ensemble classifiers (Institute of Electrical and Electronics Engineers Inc., 2015) Kanakaraj, M.; Guddeti, G.
Most sentiment analysis systems use a bag-of-words approach for mining sentiments from online reviews and social media data.
Rather than considering the whole sentence or paragraph for analysis, the bag-of-words approach considers only individual words and their counts as the feature vectors. This may mislead the classification algorithm, especially when used for problems like sentiment classification. Traditional machine learning algorithms like Naive Bayes, Maximum Entropy, SVM etc. are widely used to solve classification problems. These machine learning algorithms often suffer from bias towards a particular class. In this paper, we propose a Natural Language Processing (NLP) based approach to enhance sentiment classification by adding semantics to feature vectors and using ensemble methods for classification. Adding semantically similar words and context-sense identities to the feature vectors will increase the accuracy of prediction. Experiments conducted demonstrate that the semantics based feature vector with an ensemble classifier outperforms the traditional bag-of-words approach with a single machine learning classifier by 3-5%. © 2015 IEEE.

Item Towards sentiment orientation data set enrichment (Association for Computing Machinery, 2016) Sankaranarayanan, S.; Ingale, D.; Bhambhu, R.; Chandrasekaran, K.
Sentiment orientation data sets, referred to variously as affective word lists, opinion lexicons, sentiment lexicons, emotion lexicons or sentiment dictionaries, contain a list of words scored for the degree of positive and negative emotion they exhibit. Although these lists have been used extensively for the sentiment analysis of text data, they contain a limited number of words that are often inadequate for data obtained from modern text sources dominated by the influence of social media, which has resulted in the coining of new words on a regular basis. In an effort to enrich these data sets with new words, we propose two methods. The first method involves the sentiment analysis of portmanteau words.
We have hypothesized that the sentiment score of a portmanteau word, which is a combination of two (or more) words and their meanings into a single new word, can be determined as a function of the sentiment scores of its component words. Regression analysis has been used to determine this functional relationship, and several cases arising from the above have been evaluated on a data set constructed from SentiWordNet. The second method is an in situ approach for sentiment discovery for unknown words that uses labeled tweets and words from the sentiment orientation data set as inputs to discover the sentiment score of the unknown word. In order to validate the resultant score, we have also used a novel validation-feedback mechanism akin to cross-validation. Both methods produce acceptable levels of accuracy, indicating that they can be implemented in practice. © 2016 ACM.

Item A personalized recommender system using Machine Learning based Sentiment Analysis over social data (Institute of Electrical and Electronics Engineers Inc., 2016) Ashok, M.; Rajanna, S.; Joshi, P.V.; Kamath S․, S.S.
Social media platforms are already an indispensable part of our daily lives. With their constant growth, they have contributed superfluous, heterogeneous data that can be overwhelming due to its volume and velocity, thus limiting the availability of relevant and required information when a particular query is to be served. Hence, there is an increasing need for a personalized, fine-grained, user preference-oriented framework to resolve this problem and enhance user experience. In this paper, we propose such a social framework, which extracts users' reviews and comments on restaurants and points of interest such as events and locations, to personalize and rank suggestions based on user preferences. Machine Learning and Sentiment Analysis based techniques are used for further optimizing search query results.
This gives the user quicker access to more relevant data, avoiding irrelevant results and providing much-needed personalization. © 2016 IEEE.

Item Sentiment extraction from naturalistic video (Elsevier B.V., 2018) Radhakrishnan, V.; Joseph, C.; Chandrasekaran, K.
Sentiment analysis on video is a largely unexplored field of research wherein the emotion and sentiment of the speaker are extracted by processing the frames, audio and text obtained from the video. In recent times, sentiment analysis from naturalistic audio has been an upcoming field of research. This is typically done by performing automatic speech recognition on audio, followed by extracting the sentiment exhibited by the speaker. On the other hand, techniques for extracting sentiments from text are quite developed, and tech giants have already optimized these methods to process large amounts of customer reviews, feedback and reactions. In this paper, a new model for sentiment analysis from audio is proposed which is a hybrid of a Keyword Spotting System (KWS) and a Maximum Entropy (ME) classifier system. This model is developed with the aim of outperforming other conventional classifiers and providing a single integrated system for audio and text processing. In addition, a web application for dynamic processing of YouTube videos is described. The WebApp provides an index-based result for each phrase that is detected in the video. Often, the emotion of the viewer of a video corresponds to its content. In this regard, it is useful to map these emotions to the text transcript of the video and assign a suitable weight to it while predicting the sentiment that the speaker exhibits. This paper describes such an application that was developed to analyze facial expressions using the Affdex API.
Thus, using the combined statistics from all three aforementioned components, a robust and portable system for emotion detection is obtained that provides accurate predictions and can be deployed on any modern system with minimal configuration changes. © 2018 The Authors. Published by Elsevier B.V.

Item A Bag-of-Phonetic-Codes Model for Cyber-Bullying Detection in Twitter (Institute of Electrical and Electronics Engineers Inc., 2018) Shekhar, A.; Venkatesan, M.
Social networking sites such as Twitter, Facebook, MySpace and Instagram are emerging as a strong medium of communication these days. They have become part and parcel of daily life. People can share their thoughts and activities within their social circle, which brings them closer to their community. However, this freedom of expression has its drawbacks. Sometimes people show their aggression on social media, which in turn hurts the sentiments of the targeted victims. Certain forms of cyber-bullying are based on sex, race and physical disability. Hence, proper surveillance is necessary to tackle such situations. Twitter, as a micro-blogging site, sees cyber abuse on a daily basis. However, tweets are raw text containing many misspelled and censored words. This paper proposes a novel method to detect cyber-bullying, a Bag-of-Phonetic-Codes model. Using the pronunciation of words as features can rectify misspelled words and can identify censored words. Correctly identifying duplicate words can lead to a smaller vocabulary, thereby reducing the feature space. The inspiration for this proposed work is drawn from the well-known Bag-of-Words model for extracting textual features. Phonetic code generation has been done using the Soundex algorithm. Besides the proposed model, experiments were carried out with both supervised and unsupervised machine learning approaches on multiple datasets to understand the approaches and challenges in the domain of cyber-bullying detection.
© 2018 IEEE.

Item Conversational Hate-Offensive detection in Code-Mixed Hindi-English Tweets (CEUR-WS, 2021) Rajalakshmi, R.; Srivarshan, S.; Mattins, F.; Kaarthik, E.; Seshadri, P.; Anand Kumar, M.
Hate speech in social media has increased due to the increased use of online forums for sharing opinions. In particular, people prefer to express their views in their native language when posting such objectionable content on social media platforms. Building an automated system to identify such hate and offensive tweets in many regional languages is challenging due to their rich linguistic nature. Recently, this problem has become more complicated due to the use of multi-lingual and code-mixed tweets. Code-mixed data mixes two languages at a granular level, and a word that is not part of either language may be found in the data. To address the above challenges in Hindi-English tweets, we propose an efficient method that combines IndicBERT with an effective ensemble based method. We have applied different methodologies to accurately classify whether a given tweet is considered to be Hate Speech or Not in a code-mixed Hinglish dataset. Three different models, namely IndicBERT, XLM-RoBERTa and Masked LM, were used to embed the tweet data. Then various classification methods such as Logistic Regression, Support Vector Machine, Ensembling and Neural Networks based methods were applied to perform classification. From extensive experiments on the data set, embedding the code-mixed data with IndicBERT and Ensembling was found to be the best method, which resulted in a macro F1-score of 62.53%. This work was submitted to the shared task of the HASOC 2021 [1] [2] Hate Speech and Offensive Content Identification in English and Indo-Aryan Languages Competition by team TNLP.
© 2021 Copyright for this paper by its authors.

Item Sentiment Analysis and Homophobia Detection of YouTube Comments (CEUR-WS, 2022) Ugursandi, S.; Anand Kumar, A.M.
Sentiment analysis identifies a graded scale of opinions or emotional responses to a particular subject. Many industries and organisations have been actively researching this area for more than 20 years. The key to understanding a user’s behaviour while responding on a social media site is to understand their feelings. In contemporary research, a sentence’s content is evaluated and its emotion predicted, which helps researchers gain insight into an individual’s reaction to a social media topic. Here, a sentence’s text data is analysed using several Natural Language Processing techniques and then used for multi-class classification. The detection of homophobia and transphobia in comments on YouTube or other social media sites is the second objective of this work. Anger, discomfort, or suspicion against Lesbian, Gay, Bisexual and Transgender people is known as homophobia. It can incite individuals to feel panic, dislike, disrespect, aggression, or wrath. By identifying such occurrences on social media, we can better understand how society works and how people behave. The goal of this work is to analyze social media texts such as comments from YouTube and detect homophobic sentiments using deep learning or machine learning models. In this work, a 6-layer classification model is used. The F1-score of the proposed model was 0.5 on multi-class sentiment classification and 0.97 on homophobic/transphobic classification; the model achieved 1st rank in homophobia detection for the Malayalam language and 4th rank in sentiment analysis for the Kannada language.
© 2022 Copyright for this paper by its authors.

Item Revealing Insights: Sentiment Analysis of Indian Annual Reports (Institute of Electrical and Electronics Engineers Inc., 2024) Chaithra; Mohan, B.R.
Annual reports are the corporate documents companies publish every year. These documents contain crucial company performance information and are often analyzed manually and objectively. Investors often ignore the annual report's qualitative data and focus only on quantitative data. In the literature, it has been demonstrated that managers' word choices, CSR initiatives, and sentiments expressed in annual reports are related to future stock returns, earnings, and management fraud. Therefore, the study aims to observe sentiment orientation in CEO letters, Management Discussion and Analysis (MD&A), and Corporate Social Responsibility (CSR) sections and to examine the relation between sentiment and company performance. Annual reports of NSE-listed companies are considered for the study. In the proposed approach, the results of the LM Dictionary-Based technique, Naive Bayes, SVM, RF, LSTM, and FinBERT models are considered to determine the final sentiment. The annual report tone is calculated and compared with the performance indicators, i.e., Return on Assets (ROA) and Return on Equity (ROE). © 2024 IEEE.
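
The last abstract above says the outputs of several models are considered to determine the final sentiment, but it does not say how they are combined. The following is a minimal sketch assuming simple majority voting over per-model labels. Majority voting, the tie rule, the function name and the sample labels are all assumptions made for illustration, not the authors' stated method.

```python
from collections import Counter


def majority_sentiment(labels, tie_label="neutral"):
    """Return the most frequent label; fall back to tie_label on a tie.

    Majority voting and the tie rule are assumptions for illustration only.
    """
    labels = list(labels)
    if not labels:
        raise ValueError("at least one model label is required")
    ranked = Counter(labels).most_common()
    # A tie exists when the two highest counts are equal.
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return tie_label
    return ranked[0][0]


# Hypothetical per-model outputs for one report section (not real results).
model_outputs = {
    "lm_dictionary": "positive",
    "naive_bayes": "positive",
    "svm": "negative",
    "random_forest": "positive",
    "lstm": "neutral",
    "finbert": "positive",
}

final_sentiment = majority_sentiment(model_outputs.values())
print(final_sentiment)
```

A weighted vote, for example one that trusts a finance-specific model more than a general classifier, would be an equally plausible reading of the abstract.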
