Conference Papers

Permanent URI for this collection: https://idr.nitk.ac.in/handle/123456789/28506

  • Item
    NLP based intelligent news search engine using information extraction from e-newspapers
    (Institute of Electrical and Electronics Engineers Inc., 2014) Kanakaraj, M.; Kamath S․, S.
    Extracting text information from a web news page is a challenging task, as most e-news content is provided with support from backend Content Management Systems (CMSs). In this paper, we present a personalized news search engine that focuses on building a repository of news articles by efficiently extracting text information from web news pages across varied e-news portals. The system is based on Document Object Model (DOM) tree manipulation for extracting text and modifying the web page structure to exclude irrelevant content such as ads and user comments. We also use WordNet, a thesaurus of the English language based on psycholinguistic studies, to match the extracted content semantically to the title of the web page. TF-IDF (Term Frequency-Inverse Document Frequency) is used to identify the web page blocks carrying information relevant to the page's title. In addition to information extraction, functionality to gather related information from different web newspapers and to summarize the gathered information based on user preferences has also been included. We observed that the system was able to achieve good recall and high precision for both generalized and specific queries. © 2014 IEEE.
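
The abstract above mentions using TF-IDF to find the page blocks most relevant to the page title. As a rough illustration only (not the authors' implementation), the sketch below scores each text block against a title using a smoothed TF-IDF weight. The function names `tokenize` and `score_blocks`, the smoothing formula, and the sample title and blocks are all assumptions made for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def score_blocks(title, blocks):
    """Return one TF-IDF relevance score per block with respect to the title."""
    docs = [tokenize(block) for block in blocks]
    n = len(docs)
    # Document frequency: the number of blocks that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    title_terms = set(tokenize(title))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        score = 0.0
        for term in title_terms:
            if tf[term]:
                # Smoothed IDF (an assumed variant) stays positive even when
                # every block contains the term.
                idf = math.log((1 + n) / (1 + df[term])) + 1
                score += (tf[term] / len(doc)) * idf
        scores.append(score)
    return scores


# Hypothetical page: one article block and two irrelevant blocks.
title = "City council approves new budget"
blocks = [
    "The city council approved the new budget on Monday after a long debate.",
    "Subscribe now and save on your first order.",
    "Leave a comment below to join the discussion.",
]
scores = score_blocks(title, blocks)
best_block = blocks[scores.index(max(scores))]
print(best_block)
```

In this toy input only the first block shares terms with the title, so it receives the only non-zero score. A real system would also need stemming or lemmatization so that, for instance, "approves" and "approved" match.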
  • Item
    Performance analysis of Ensemble methods on Twitter sentiment analysis using NLP techniques
    (Institute of Electrical and Electronics Engineers Inc., 2015) Kanakaraj, M.; Guddeti, G.
    Mining opinions and analyzing sentiments from social network data helps in various fields, such as event prediction and analyzing the overall mood of the public on a particular social issue. This paper analyzes the mood of society on a particular news item from Twitter posts. The key idea of the paper is to increase the accuracy of classification by including Natural Language Processing (NLP) techniques, especially semantics and Word Sense Disambiguation. The mined text information is subjected to ensemble classification to analyze the sentiment. Ensemble classification involves combining the effect of various independent classifiers on a particular classification problem. Experiments conducted demonstrate that the ensemble classifier outperforms traditional machine learning classifiers by 3-5%. © 2015 IEEE.
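
The abstract above describes ensemble classification as combining several independent classifiers. It does not say which combination rule the paper uses, so the sketch below assumes simple majority voting purely for illustration. The three keyword-based classifiers are hypothetical stand-ins for trained models.

```python
from collections import Counter


def majority_vote(classifiers, text):
    """Return the label predicted by the largest number of classifiers."""
    votes = [classify(text) for classify in classifiers]
    # most_common(1) yields the top label; on a tie, the label that was
    # encountered first among the votes wins.
    return Counter(votes).most_common(1)[0][0]


# Hypothetical toy classifiers, each looking at a single keyword.
def clf_good(text):
    return "positive" if "good" in text else "negative"


def clf_great(text):
    return "positive" if "great" in text else "negative"


def clf_bad(text):
    return "negative" if "bad" in text else "positive"


ensemble = [clf_good, clf_great, clf_bad]
print(majority_vote(ensemble, "a good day"))
```

Here two of the three toy classifiers label "a good day" as positive, so the ensemble does too, even though one classifier disagrees. That is the basic reason voting can smooth out the errors of individual models.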
  • Item
    NLP based sentiment analysis on Twitter data using ensemble classifiers
    (Institute of Electrical and Electronics Engineers Inc., 2015) Kanakaraj, M.; Guddeti, G.
    Most sentiment analysis systems use the bag-of-words approach for mining sentiments from online reviews and social media data. Rather than considering the whole sentence or paragraph for analysis, the bag-of-words approach considers only individual words and their counts as the feature vectors. This may mislead the classification algorithm, especially when used for problems like sentiment classification. Traditional machine learning algorithms such as Naive Bayes, Maximum Entropy, and SVM are widely used to solve classification problems. These machine learning algorithms often suffer from bias towards a particular class. In this paper, we propose a Natural Language Processing (NLP) based approach to enhance sentiment classification by adding semantics to the feature vectors and using ensemble methods for classification. Adding semantically similar words and context-sense identities to the feature vectors will increase the accuracy of prediction. Experiments conducted demonstrate that the semantics-based feature vector with an ensemble classifier outperforms the traditional bag-of-words approach with a single machine learning classifier by 3-5%. © 2015 IEEE.
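
The abstract above argues that bag-of-words features can mislead a classifier and that adding semantically similar words may help. The sketch below illustrates both points under stated assumptions: the `SYNONYMS` table is a hypothetical stand-in for a real lexical resource, and `expand_with_synonyms` is an illustrative helper, not the paper's feature-construction method.

```python
import re
from collections import Counter

# Hypothetical synonym table standing in for a real lexical resource.
SYNONYMS = {"good": ["great", "fine"], "bad": ["poor", "awful"]}


def bag_of_words(text):
    """Count word occurrences, discarding word order entirely."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def expand_with_synonyms(features, synonyms=SYNONYMS):
    """Return a copy of the features with counts added for known synonyms."""
    expanded = Counter(features)
    for word, count in features.items():
        for synonym in synonyms.get(word, []):
            expanded[synonym] += count
    return expanded


# Two sentences with opposite sentiment produce identical bag-of-words features.
a = bag_of_words("The movie was good, not bad")
b = bag_of_words("The movie was bad, not good")
print(a == b)

# Synonym expansion lets a review share features with related wording.
print(expand_with_synonyms(bag_of_words("good plot")))
```

The first comparison shows why word counts alone cannot separate the two reviews. The expansion step only addresses vocabulary mismatch; handling negation and word sense would need the context-aware features the abstract refers to.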