Conference Papers

Permanent URI for this collection: https://idr.nitk.ac.in/handle/123456789/28506

  • Item
    Semantic sentiment analysis using context specific grammar
    (Institute of Electrical and Electronics Engineers Inc., 2015) Bhuvan, B.M.; Rao, V.D.; Jain, S.; Ashwin, T.S.; Guddeti, G.
    The growing number of e-commerce and social networking sites is producing large amounts of data in the form of reviews of products, restaurants, etc. Close observation reveals that the text gathered from any social review site is specific to a context and subjective in nature, which gives rise to varied perceptions of sentiment. The novel idea is to define a context-specific grammar as the semantics for a particular domain. Our research aims to develop a scalable model in which features obtained from matching semantic patterns are used to predict the sentiment polarity of movie reviews and to provide a sentiment score for each review. The proposed model is intended to be flexible, so that it can be applied to any domain by redefining the semantics specific to that domain. Many other models achieve accuracies greater than 80% using various methods. One study suggests that a program with 70% accuracy is as good as humans, because humans have varied perceptions of the sentiment of a movie review, which is itself a subjective summary of a movie. Our model may give lower accuracy, but it uses a cognitive approach that tries to capture these varied perceptions by learning from a combination of positive and negative grammars. Analyzing results from various experiments, we find that Logistic Regression with SGD on Apache Spark performs better, with an accuracy of 64.12%, while being highly scalable. High dependency on the grammars is a limitation of the model. It could be improved by varying the quality and quantity of the grammars defined. © 2015 IEEE.
  • Item
    Evaluation of Machine Learning Frameworks on Bank Marketing and Higgs Datasets
    (Institute of Electrical and Electronics Engineers Inc., 2015) Bhuvan, B.M.; Jain, S.; Rao, V.D.; Patil, N.; Raghavendra, G.S.
    Big data is an emerging field in which datasets of various sizes are being analyzed for potential applications. In parallel, many frameworks are being introduced through which these datasets can be fed into machine learning algorithms. Although some experiments have compared different machine learning algorithms on different data, those comparisons have not been carried out across different platforms. Our research aims to compare two selected machine learning algorithms on datasets of different sizes, deployed on different platforms such as Weka, Scikit-Learn and Apache Spark. They are evaluated on training time, accuracy and root mean squared error. This comparison helps decide which platform is best suited for applying the selected computationally expensive machine learning algorithms to data of a particular size. Experiments suggested that Scikit-Learn would be optimal for data that fits into memory, while for very large data Apache Spark would be optimal, as it performs parallel computations by distributing the data over a cluster. Hence this study concludes that the Spark platform, which has growing support for parallel implementations of machine learning algorithms, could be optimal for analyzing big data. © 2015 IEEE.
  • Item
    Enhancing Movie Recommendation Systems with MapReduce Genetic Algorithms: Addressing Scalability and Accuracy Challenges
    (Institute of Electrical and Electronics Engineers Inc., 2024) Patidar, P.; Posa, S.V.; Girish, K.K.; Rao, S.; Bhowmik, B.
    In the world of big data, the efficacy of movie recommendation systems is crucial for personalizing user experiences in digital entertainment. Traditional methods, including collaborative and content-based filtering, often encounter limitations such as data sparsity, cold start problems, and scalability issues. This paper introduces a novel approach that integrates MapReduce technology with Genetic Algorithms (GAs) to address these challenges. Utilizing the Hadoop framework, our MapReduce Genetic Algorithm (MRGA) efficiently processes extensive datasets by distributing tasks across a cluster of machines. The genetic algorithm component optimizes recommendation accuracy through advanced techniques like selection, crossover, and mutation. Our experimental results, based on the MovieLens 100K dataset, demonstrate that the MRGA approach outperforms traditional collaborative filtering methods in terms of recommendation accuracy and scalability. By leveraging MapReduce's distributed computing power and the GA's optimization capabilities, this research offers a robust solution to improve movie recommendations and handle large-scale data efficiently. © 2024 IEEE.
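
The last abstract above names selection, crossover, and mutation as the genetic algorithm's operators. As a rough illustration only, the three operators can be sketched in a single Python process. This is not the MRGA from the paper: there is no MapReduce or Hadoop stage, and the fitness function (distance to a made-up target vector), the function names, and all parameter values are assumptions chosen to keep the example small.

```python
import random

def fitness(chromosome, target):
    # Hypothetical fitness: negative squared error against a target vector
    # (higher is better). A recommender would score prediction quality instead.
    return -sum((g - t) ** 2 for g, t in zip(chromosome, target))

def tournament_select(population, target, rng, k=3):
    # Selection: draw k individuals at random and keep the fittest.
    contenders = rng.sample(population, k)
    return max(contenders, key=lambda c: fitness(c, target))

def crossover(parent_a, parent_b, rng):
    # Single-point crossover: prefix from one parent, suffix from the other.
    point = rng.randint(1, len(parent_a) - 1)
    return parent_a[:point] + parent_b[point:]

def mutate(chromosome, rng, rate=0.1, sigma=0.2):
    # Mutation: with probability `rate`, add Gaussian noise to a gene.
    return [g + rng.gauss(0.0, sigma) if rng.random() < rate else g
            for g in chromosome]

def evolve(target, pop_size=30, generations=40, seed=0):
    rng = random.Random(seed)  # seeded so runs are repeatable
    population = [[rng.uniform(0.0, 1.0) for _ in target]
                  for _ in range(pop_size)]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        best = max(population, key=lambda c: fitness(c, target))
        history.append(fitness(best, target))
        # Elitism: carry the current best forward unchanged,
        # so the best fitness never gets worse between generations.
        next_population = [best]
        while len(next_population) < pop_size:
            child = crossover(tournament_select(population, target, rng),
                              tournament_select(population, target, rng),
                              rng)
            next_population.append(mutate(child, rng))
        population = next_population
    best = max(population, key=lambda c: fitness(c, target))
    history.append(fitness(best, target))
    return best, history

if __name__ == "__main__":
    best, history = evolve([0.2, 0.8, 0.5, 0.1])
    print("best chromosome:", best)
    print("initial vs final best fitness:", history[0], history[-1])
```

In a MapReduce setting, one plausible split would be to evaluate fitness in the map phase and apply selection and recombination in the reduce phase. That division is an assumption here, not something the abstract specifies.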