2. Conference Papers
Permanent URI for this collection: https://idr.nitk.ac.in/handle/1/7
37 results
Search Results
Item: Resource aware scheduling in Hadoop for heterogeneous workloads based on load estimation (2013)
Kapil, B.S.; Sowmya, Kamath S.
Currently, most cloud-based applications require large-scale data processing capability. Data to be processed is growing at a rate much faster than the available computing power. Hadoop is used to enable distributed processing on large clusters of commodity hardware. In large clusters, the workloads may be heterogeneous in nature, that is, I/O-bound, CPU-bound, or network-intensive jobs with different resource requirements may run simultaneously on the cluster. Hadoop's job scheduling is based on FIFO, which does not take parallelization based on job type into account. In this paper, we propose a new scheduling algorithm for Hadoop-based distributed systems, based on the classification of workloads, which assigns a specific workload category to a particular cluster according to the cluster's current load. The proposed scheduler improves the performance of both CPU and I/O resources in a cluster under heterogeneous workloads by approximately 12% when compared to Hadoop's FIFO scheduler. © 2013 IEEE.

Item: Query-oriented unsupervised multi-document summarization on big data (2016)
Sunaina; Sowmya, Kamath S.
Real-time document summarization is a critical need nowadays, owing to the large volume of information available for our reading and our inability to deal with it entirely due to limitations of time and resources. Oftentimes, information is available in multiple sources, offering multiple contexts and viewpoints on a single topic of interest. Automated multi-document summarization (MDS) techniques aim to address this problem. However, current techniques for automated MDS suffer from low precision and accuracy with reference to a given subject matter when compared to summaries prepared by humans, and take a long time to create the summary when the input is very large.
In this paper, we propose a hybrid MDS technique combining feature-based algorithms and dynamic programming for generating a summary from multiple documents based on a user-provided query. Further, in real-world scenarios, Web search serves up a large number of URLs to users, and the work of making sense of these with reference to a particular query is left to the user. In this context, an efficient parallelized MDS technique based on Hadoop is also presented, for serving a concise summary of the contents of multiple Web pages for a given user query in a reduced time duration. © 2016 ACM.

Item: Performance evaluation of web browsers in Android (2013)
Harsha, Prabha, E.; Piraviperumal, D.; Naik, D.; Sowmya, Kamath S.; Prasad, G.
In this day and age, smartphones are fast becoming ubiquitous. They have evolved from their traditional use as solely a device for communication between people into a multipurpose device. With the advent of Android smartphones, the number of people accessing the Internet through their mobile phones is on a steep rise. Hence, web browsers play a major role in providing a highly enjoyable browsing experience for their users. As such, the objective of this paper is to analyze the performance of five major mobile web browsers available on the Android platform. In this paper, we present the results of a study conducted based on several parameters that assess these mobile browsers' functionalities. Based on this evaluation, we also recommend the best among these browsers to further enrich the user experience of mobile web browsing along with utmost performance. © 2013 Springer Science+Business Media New York.

Item: Ontology based approach for event detection in Twitter datastreams (2015)
Kaushik, R.; Apoorva, Chandra, S.; Mallya, D.; Chaitanya, J.N.V.K.; Sowmya, Kamath S.
In this paper, we present a system that attempts to interpret relations in social media data based on an automatically constructed, dataset-specific ontology.
Twitter data pertaining to real-world events, such as product launches and the buzz they generate among Twitter users, was used to develop a prototype of the system. Twitter data is filtered using certain tag-words, which are used to build an ontology based on extracted entities. Wikipedia data on the entities is collected and processed semantically to retrieve inherent relations and properties. The system uses these results to discover related entities and the relationships between them. We present the results of experiments to show how the system was able to effectively construct the ontology and discover inherent relationships between entities belonging to two different datasets. © 2015 IEEE.

Item: Ontology based algorithms for indexing and search of semantically close natural language phrases (2007)
Sowmya, Kamath S.
Free text constitutes an overwhelming fraction of the information available on the World Wide Web. Specifically, consider the small chunks of natural language phrases frequently used by Web users to describe things relevant to them. For example, consider the following two posts on a classifieds site (which serves a small locality, say, a university campus): "2 Tickets for the prom tonight" and "Trade 2 extra passes for tonight's Ball for $25". For a human looking at these two posts, it is trivial to conclude that he has found what he wanted. But when there are thousands of such posts, and in the absence of any common keywords or any additional information from the user, it is unlikely that naive keyword-based matching will be of any help in reflecting the glaring similarity between these descriptions. This problem is very relevant and challenging because users tend to describe the same item in several different ways. Humans frequently use their common sense and background knowledge to infer that these relate to the same item. However, the enormous sizes of most datasets prohibit manual classification.
To automate this, we present intuitive and scalable algorithms which use existing ontologies like WordNet to correctly relate semantically close descriptions.

Item: NLP based intelligent news search engine using information extraction from e-newspapers (2015)
Kanakaraj, M.; Sowmya, Kamath S.
Extracting text information from a web news page is a challenging task, as most e-news content is served by backend Content Management Systems (CMSs). In this paper, we present a personalized news search engine that focuses on building a repository of news articles by efficiently extracting text information from web news pages across varied e-news portals. The system is based on the concept of Document Object Model (DOM) tree manipulation for extracting text and modifying the web page structure to exclude irrelevant content like ads and user comments. We also use WordNet, a thesaurus of the English language based on psycholinguistic studies, for semantically matching the extracted content to the title of the web page. TF-IDF (Term Frequency-Inverse Document Frequency) is used for identifying the web page blocks carrying information relevant to the page's title. In addition to the extraction of information, functionalities to gather related information from different web newspapers and to summarize the gathered information based on user preferences have also been included. We observed that the system was able to achieve good recall and high precision for both generalized and specific queries. © 2014 IEEE.

Item: Machine learning for mobile wound assessment (2018)
Sowmya, Kamath S.; Sirazitdinova, E.; Deserno, T.M.
Chronic wounds affect millions of people around the world. In particular, elderly persons in home care may develop decubitus. Here, mobile image acquisition and analysis can provide good assistance. We develop a system for mobile wound capture using mobile devices such as smartphones.
The photographs are acquired with the integrated camera of the device and then calibrated and processed to determine the size of the various tissue types present in a wound, i.e., necrotic, sloughy, and granular tissue. A random forest classifier based on various color and texture features is used for this. These features are Sobel, Hessian, membrane projections, variance, mean, median, anisotropic diffusion, and bilateral as well as Kuwahara filters. The resultant probability output is thresholded using the Otsu technique. The similarity between manual ground-truth labeling and the classification is measured. The acquired results are compared to those achieved with a basic color-thresholding technique, as well as those produced by an SVM classifier. The fast random forest was found to produce better results. It also shows superior performance when the method is applied only to the wound regions with the background subtracted. Mean similarity is 0.89, 0.39, and 0.44 for necrotic, sloughy, and granular tissue, respectively. Although the training phase is time-consuming, the trained classifier performs fast enough to be implemented on the mobile device. This will allow comprehensive monitoring of skin lesions and wounds. © 2018 SPIE.

Item: Improved approximation algorithm for vertex cover problem using articulation points (2014)
Patel, S.; Sowmya, Kamath S.
Many algorithms have been proposed for the well-known NP-complete problem of vertex cover. The vertex cover problem is important to address in graphs, as it has various real-world applications, e.g., wireless communication networks, airline communication networks, and terrorist communication networks. In this paper, we propose a new algorithm based on articulation points, which computes a vertex cover in polynomial time and yields a solution nearer to the optimal solution, better than the classical approach.
We also present a graphical visualization tool that allows the automatic application of the improved articulation-point-based approximation algorithm to process large graphs and find their articulation points for minimal vertex cover computation. The tool is currently under development.

Item: Improved speculative Apriori with percentiles algorithm for website restructuring based on usage patterns (2016)
Gahlot, G.; Sowmya, Kamath S.
Web structure mining techniques are popularly used in the process of improved website design/replanning based on user browsing actions. In this paper, an algorithm for improving the design map (site map) of a website using the pertinent information available in the website's server logs is proposed, which incorporates probability to extend the well-known Apriori algorithm. The proposed methodology harnesses the normal distribution curve used in statistical measurements to improve recommendation accuracy after parsing the server log file. This allows the discovery of more association rules, as the idea is to use percentile calculations instead of percentages and to perform a relative search within the itemsets to determine their existence in the domain. By enforcing the percentile calculations on the distribution curve of the collection, selective items from the small groups within it can be obtained. Experimental results for the proposed Speculative Apriori with Percentiles (SAwP) algorithm indicate that it was effective in discovering relevant itemsets and more association rules when compared to the classical Apriori algorithm. © 2016 IEEE.

Item: Graph energy ranking for scale-free networks using Barabasi-Albert model (2019)
Mahadevi, S.; Sowmya, Kamath S.
A social network is a vast collection of actors and interactions, and forms one kind of complex network. There are various types of social networks, such as acquaintance networks, online social networks, covert networks, citation networks, and collaboration networks.
Most of these real-world networks are scale-free, and they follow a power-law degree distribution. Each of these networks has nodes which play various roles, and not all nodes are equally important. Hence, we need to rank them based on their importance. In this paper, we propose an algorithm named Graph Energy Ranking (GER) to rank the nodes of scale-free networks built using the Barabasi-Albert model. GER analyzes the impact of node deletion on the underlying network and therefore gives a better understanding of the network's features. A study of the rankings produced by existing centrality measures versus GER is performed to observe the similarity in the ranking process. © 2019 IEEE.
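The load-aware scheduling idea in the Hadoop entry above (classify jobs by workload type and place them according to current cluster load, rather than FIFO order) can be illustrated with a toy dispatcher. The node names, job classes, and load costs here are illustrative assumptions, not the paper's actual scheduler:

```python
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    cpu_load: float = 0.0   # fraction of CPU capacity in use
    io_load: float = 0.0    # fraction of disk/network bandwidth in use

@dataclass
class Job:
    name: str
    kind: str               # workload class: "cpu" or "io"
    cost: float = 0.2       # load the job adds to its dominant resource

def dispatch(job, nodes):
    """Place a job on the node least loaded on the resource it stresses."""
    key = (lambda n: n.cpu_load) if job.kind == "cpu" else (lambda n: n.io_load)
    target = min(nodes, key=key)
    if job.kind == "cpu":
        target.cpu_load += job.cost
    else:
        target.io_load += job.cost
    return target.name

nodes = [Node("n1"), Node("n2", cpu_load=0.8), Node("n3", io_load=0.9)]
placed_cpu = dispatch(Job("sort", "cpu"), nodes)   # steers away from busy n2
placed_io = dispatch(Job("scan", "io"), nodes)     # steers away from busy n3
```

A FIFO scheduler would place both jobs on whichever node came up next regardless of its load; the class-aware dispatcher keeps CPU-heavy and I/O-heavy work off the nodes already saturated on that resource.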
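The TF-IDF scoring that the NLP news search engine entry above uses to find page blocks relevant to a page's title can be sketched in plain Python. The tokenizer, smoothing, and sample blocks are illustrative assumptions; the paper's pipeline operates on DOM-extracted content:

```python
import math
from collections import Counter

def tokenize(text):
    # Naive lowercase/whitespace tokenizer; real pipelines do more.
    return text.lower().split()

def tfidf_scores(query, blocks):
    """Score each text block by the summed TF-IDF weight of the query terms."""
    docs = [Counter(tokenize(b)) for b in blocks]
    n = len(docs)

    def idf(term):
        df = sum(1 for d in docs if term in d)
        return math.log((n + 1) / (df + 1)) + 1   # smoothed IDF

    scores = []
    for d in docs:
        total = sum(d.values())
        s = sum((d[t] / total) * idf(t) for t in tokenize(query) if total)
        scores.append(s)
    return scores

blocks = [
    "election results announced by the commission today",
    "subscribe to our newsletter for daily updates",
    "the election commission declared final results",
]
scores = tfidf_scores("election results", blocks)
best = max(range(len(blocks)), key=lambda i: scores[i])
```

Blocks sharing terms with the title score highest, while boilerplate such as the newsletter prompt scores zero, which is the basis for discarding irrelevant page regions.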
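The articulation-point idea in the vertex cover entry above can be sketched as follows: find articulation points with a standard Tarjan-style DFS, seed the cover with them, then cover any remaining edges greedily. This is a plausible reading of the approach under stated assumptions, not the authors' exact algorithm:

```python
def articulation_points(adj):
    """Tarjan's DFS-based articulation point detection on an adjacency list."""
    n = len(adj)
    disc, low = [0] * n, [0] * n
    visited = [False] * n
    points, timer = set(), [1]

    def dfs(u, parent):
        visited[u] = True
        disc[u] = low[u] = timer[0]
        timer[0] += 1
        children = 0
        for v in adj[u]:
            if v == parent:
                continue
            if visited[v]:
                low[u] = min(low[u], disc[v])
            else:
                children += 1
                dfs(v, u)
                low[u] = min(low[u], low[v])
                if parent != -1 and low[v] >= disc[u]:
                    points.add(u)
        if parent == -1 and children > 1:
            points.add(u)

    for u in range(n):
        if not visited[u]:
            dfs(u, -1)
    return points

def vertex_cover(adj):
    """Greedy cover seeded with articulation points (they lie on many paths)."""
    cover = set()
    uncovered = {(u, v) for u in range(len(adj)) for v in adj[u] if u < v}
    for p in sorted(articulation_points(adj)):
        touched = {e for e in uncovered if p in e}
        if touched:
            cover.add(p)
            uncovered -= touched
    while uncovered:                      # cover whatever edges remain
        u, _ = next(iter(uncovered))
        cover.add(u)
        uncovered = {e for e in uncovered if u not in e}
    return cover

adj = [[1], [0, 2], [1, 3], [2, 4], [3]]  # a path graph 0-1-2-3-4
cover = vertex_cover(adj)
```

The result is always a valid cover, though not necessarily minimal; on the path graph the interior nodes 1, 2, 3 are all articulation points and get selected.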
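The graph energy ranking entry above combines two standard ingredients that can be sketched directly: Barabasi-Albert preferential attachment to build a scale-free graph, and graph energy (the sum of the absolute eigenvalues of the adjacency matrix) measured before and after deleting each node. The simplified attachment scheme and the energy-drop ranking below are assumptions about the general approach, not the paper's GER implementation:

```python
import random
import numpy as np

def barabasi_albert(n, m, seed=42):
    """Simplified BA model: each new node attaches to m nodes chosen with
    probability proportional to degree (duplicate picks are merged)."""
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    targets = list(range(m))      # initial nodes
    repeated = []                 # node list weighted by degree
    for new in range(m, n):
        for t in set(targets):
            adj[new].add(t)
            adj[t].add(new)
        repeated.extend(targets)
        repeated.extend([new] * m)
        targets = [rng.choice(repeated) for _ in range(m)]
    return adj

def graph_energy(adj):
    """Graph energy: sum of |eigenvalues| of the adjacency matrix."""
    n = len(adj)
    a = np.zeros((n, n))
    for u, nbrs in adj.items():
        for v in nbrs:
            a[u, v] = 1.0
    return float(np.abs(np.linalg.eigvalsh(a)).sum())

def energy_rank(adj):
    """Rank nodes by the energy drop caused by deleting them."""
    base = graph_energy(adj)
    drops = {}
    for u in adj:
        rest = {v: nbrs - {u} for v, nbrs in adj.items() if v != u}
        idx = {v: i for i, v in enumerate(rest)}      # reindex to stay square
        rest = {idx[v]: {idx[w] for w in nbrs} for v, nbrs in rest.items()}
        drops[u] = base - graph_energy(rest)
    return sorted(adj, key=lambda u: drops[u], reverse=True)

adj = barabasi_albert(20, 2)
ranking = energy_rank(adj)
```

Nodes whose removal collapses the most spectral weight rank first, which is the sense in which node deletion "analyzes the impact on the underlying network" in the abstract.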