Cleaning and sentiment tasks for news transcript data
Date
2017
Authors
Lakshman, V.
Ananth, S.
Chhanchan, R.
Chandrasekaran, K.
Abstract
Today, a vast amount of news in various forms is hosted on the web, including news articles, digital newspapers, news clips, podcasts, and other sources. Traditionally, news articles and other written pieces have been used to carry out sentiment analysis for topics. However, news channels and their transcripts represent a vast body of data that has not been examined for business aspects. In this light, we have charted out a methodology to gather transcripts and process them for sentiment tasks by building a system to crawl web pages for documents, index them, and aggregate them for topic analysis. A vector space model has been used for document indexing with a predetermined set of topics, and sentiment analysis is carried out through the SentiWordNet data set, a lexical resource used for opinion mining. The areas of insight are mainly the polarity index (degree of polarity or subjectivity) of the news presented as well as its coverage. This research shows insights that can be used by businesses to assess the content and quality of their content. © Springer Science+Business Media Singapore 2017.
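The pipeline described in the abstract (vector space indexing against a predetermined topic set, followed by lexicon-based polarity scoring) can be illustrated with a minimal sketch. This is not the authors' implementation: the topic descriptions, the `LEXICON` values, and all function names are hypothetical, and the toy lexicon only stands in for SentiWordNet scores, which are not reproduced here.

```python
import math
import re
from collections import Counter

# Hypothetical topic descriptions; the paper's actual topic set is not listed here.
TOPICS = {
    "economy": "market stocks inflation economy trade bank",
    "sports": "match team score tournament player coach",
}

# Toy polarity lexicon standing in for SentiWordNet (values are invented).
LEXICON = {
    "gain": 0.5, "strong": 0.4, "win": 0.6,
    "loss": -0.5, "weak": -0.4, "crisis": -0.7,
}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def tf_vector(tokens):
    """Represent a token list as a sparse term-frequency vector."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {term: count / total for term, count in counts.items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_topic(text):
    """Assign the transcript to the topic with the highest cosine similarity."""
    doc = tf_vector(tokenize(text))
    scores = {
        name: cosine(doc, tf_vector(tokenize(description)))
        for name, description in TOPICS.items()
    }
    return max(scores, key=scores.get)


def polarity_index(text):
    """Mean lexicon score over the tokens that appear in the lexicon."""
    scored = [LEXICON[t] for t in tokenize(text) if t in LEXICON]
    return sum(scored) / len(scored) if scored else 0.0
```

Calling `best_topic` and `polarity_index` on each crawled transcript and then averaging the polarity per topic would give a rough per-topic polarity and coverage summary, under the assumptions stated above.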
Citation
Advances in Intelligent Systems and Computing, 2017, Vol. 507, pp. 189-200