Conference Papers

Permanent URI for this collection: https://idr.nitk.ac.in/handle/123456789/28506

Search Results

Now showing 1 - 8 of 8
  • Item
    Alignment based similarity distance measure for better web sessions clustering
    (Elsevier B.V., 2011) Poornalatha, G.; Raghavendra, P.S.
    The evolution of the internet and the popularity of the web have drawn considerable research attention to web usage mining. The amount of data available on the web is growing exponentially, and the required information may not be immediately accessible; web usage mining extracts useful information from the large volume of web logs that record which web pages were accessed. Because of this volume, it is better to handle small groups of data at a time than to process the entire data set together. Clustering the data requires a similarity measure that gives the distance between any two user sessions. This paper proposes a technique that measures the similarity between any two user sessions using sequence alignment based on dynamic programming. © 2011 Published by Elsevier Ltd.
  • Item
    Web user session clustering using modified K-means algorithm
    (2011) Poornalatha, G.; Raghavendra, P.S.
    The proliferation of the internet and the attractiveness of the web in recent years have made web mining a research area of great importance. Web mining has many advantages that make the technology attractive to researchers. Analysis of web users' navigational patterns within a web site can provide useful information for applications such as server performance enhancement, web site restructuring, and direct marketing in e-commerce. The navigation paths may be explored using similarity criteria in order to draw useful inferences about web usage. This paper proposes an effective clustering technique that groups users' sessions by modifying the K-means algorithm, and suggests a method to compute the distance between sessions based on the similarity of their web access paths, which handles user sessions of variable length. © 2011 Springer-Verlag.
  • Item
    A parallel segmentation of brain tumor from magnetic resonance images
    (2012) Dessai, V.S.; Arakeri, M.P.; Guddeti, G.
    Medical image segmentation is now at the core of medical image analysis and supports computer-aided diagnosis, surgical planning, intra-operative guidance, and postoperative assessment. Considerable research effort has gone into developing effective tumor segmentation methods for brain MR (magnetic resonance) images in past years. However, the algorithms proposed so far are time consuming because they involve many mathematical computations. Also, serial segmentation of multiple MRI slices (usually required for 3D visualization) takes exponential time. Performance therefore needs to improve as far as time complexity is concerned. This paper proposes a methodology that combines K-means clustering and morphological operations for parallel segmentation of multiple MRI slices belonging to a single patient. Segmentation of multiple MRI slices for tumor extraction plays a major role in 3D (three-dimensional) visualization and serves as its input. The proposed framework follows the SIMD (Single Instruction Multiple Data) model: because the segmentation of each slice is independent of the others, the slices can be processed in parallel, and multithreading speeds up the entire process. The framework also involves no inter-process communication, which saves further time. © 2012 IEEE.
  • Item
    Life time enhancement of wireless Sensor Network using fuzzy c-means clustering algorithm
    (Institute of Electrical and Electronics Engineers Inc., 2014) Kumar, P.; Chaturvedi, A.
    The major issues in wireless sensor networks (WSNs) are the efficient use of limited resources and appropriate routing of network paths under severely constrained energy scenarios. To overcome these issues, k-means and fuzzy c-means algorithms are investigated for forming clusters and subsequently selecting cluster heads. For each cluster, the cluster head is selected based on the residual energy status (RES) of the member sensor nodes and on estimated Euclidean distances. Depending on the Euclidean distance between the sink node and the center of gravity of each cluster, the clusters are classified into five types. Further, RES estimations are presented for cluster heads as well as simple sensor network nodes. © 2014 IEEE.
  • Item
    An improved K-means algorithm using modified cosine distance measure for document clustering using Mahout with Hadoop
    (Institute of Electrical and Electronics Engineers Inc., 2015) Sahu, L.; Mohan, R.
    In this paper, we propose a novel K-means algorithm with a modified Cosine Distance Measure for clustering large datasets such as the latest Wikipedia articles and the Reuters dataset. We customize the Cosine Distance Measure used to compute similarity between objects in order to improve cluster quality. Our method calculates the similarity between objects with the Cosine Distance Measure and then brings close objects closer by squaring the distance if it lies between 0 and 0.5, and increases the distance otherwise. This minimizes the Intra-cluster distance and maximizes the Inter-cluster distance. We measure cluster quality in terms of Inter-cluster and Intra-cluster distances, good feature weighting such as TF-IDF, cluster size, and the top terms of the clusters. We compare the K-means algorithm with the Cosine and the modified Cosine Distance Measure using performance metrics such as Inter-cluster and Intra-cluster distances, cluster size, and execution time. Our experimental results show that the Intra-cluster distance is reduced by 0.016%, the Inter-cluster distance is increased by 0.012%, the cluster size is reduced by 1.5%, and the sequence file size is reduced by 4%, which results in good cluster quality. © 2014 IEEE.
  • Item
    Performance measures of fuzzy C-means algorithm in wireless sensor networks
    (Inderscience Publishers, 2017) Kumar, P.; Chaturvedi, A.
    The major issues that govern the performance of wireless sensor networks (WSNs) are the efficient use of limited resources and appropriate routing decisions for network paths under severely constrained energy scenarios. In this work, to address these issues, the use of k-means and fuzzy C-means algorithms is investigated for cluster formation and subsequent selection of cluster heads (CHs). For each newly formed cluster, the cluster head is selected based on the residual energy status (RES) of the member sensor nodes, followed by estimation of Euclidean distances. Depending on the Euclidean distance between the sink node and the estimated energy-centroid (EC) of each cluster, the clusters are classified into five types. The RES estimation is carried out for all the CHs and sensor nodes (SNs) of the network. Simulation results indicate that the fuzzy C-means algorithm performs better than the k-means algorithm. Further, a case study is presented in which the sink is allowed some movement within the service area. Here, different quadrants of the service area exhibit different patterns of query spatial distribution. The optimal location of the sink is sought to support energy-efficient operation of the WSN. © 2017 Inderscience Enterprises Ltd.
  • Item
    Mineral identification using unsupervised classification from hyperspectral data
    (Springer, 2020) Gupta, P.; Venkatesan, M.
    Hyperspectral imagery is one of the research areas in the field of remote sensing. Hyperspectral sensors record the reflectance of an object, material, or region across the electromagnetic spectrum. Mineral identification is an urban application in the field of remote sensing of hyperspectral data. Challenges with hyperspectral data include its high dimensionality and size. Principal component analysis (PCA) is used to reduce the dimension of the data by a band selection approach. Unsupervised classification is one of the hot research topics. Because ground truth data is unavailable, an unsupervised algorithm is used to classify the minerals present in the remotely sensed hyperspectral data. K-means, an unsupervised clustering algorithm, is used to classify the minerals, and an SVM is then used to check the classification accuracy. K-means is applied to end member data only. The SVM uses the k-means result as labelled data to classify another data set. © Springer Nature Singapore Pte Ltd 2020.
  • Item
    Benchmarking semantic, centroid, and graph-based approaches for multi-document summarization
    (Springer Science and Business Media Deutschland GmbH, 2021) Agrawal, A.; George, R.A.; Ravi, S.S.; Kamath S., S.
    Multi-document summarization (MDS) is a pre-programmed process that excerpts data from various documents on similar topics. We aim to employ three techniques for generating summaries from document collections on the same topic. The first approach calculates an importance score for each sentence using features including a TF-IDF matrix as well as semantic and syntactic similarity. Our algorithm sorts the sentences by importance and adds them to the summary. The second approach uses the k-means clustering algorithm to generate the summary. The third approach uses the Page Ranking algorithm, in which edges of the graph are formed between sentences that are syntactically similar but not semantically similar. All these techniques have been used to generate 100–200 word summaries for the DUC 2004 dataset. We use ROUGE scores to evaluate the system-generated summaries against the manually generated summaries. © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd 2021.
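
The first approach in the last abstract above (score each sentence, sort by importance, add to the summary) can be illustrated with a minimal sketch. This is not the authors' implementation: it uses TF-IDF only, leaves out the semantic and syntactic similarity features the abstract mentions, and treats each sentence as its own document when computing IDF. The function names `tokenize`, `tfidf_sentence_scores`, and `summarize`, and the `max_words` budget, are hypothetical names chosen for this illustration.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def tfidf_sentence_scores(sentences):
    """Return one TF-IDF based importance score per sentence.

    Assumption: each sentence counts as one "document" for the IDF term,
    and a sentence's score is the sum of tf * idf over its distinct terms.
    """
    tokenized = [tokenize(s) for s in sentences]
    n = len(tokenized)
    # Document frequency: number of sentences that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        if not tokens:
            scores.append(0.0)
            continue
        tf = Counter(tokens)
        score = sum(
            (count / len(tokens)) * math.log(n / df[term])
            for term, count in tf.items()
        )
        scores.append(score)
    return scores


def summarize(sentences, max_words=100):
    """Greedily pick the highest-scoring sentences within a word budget.

    Selected sentences are returned in their original order.
    """
    scores = tfidf_sentence_scores(sentences)
    # Rank by descending score; ties are broken by original position.
    ranked = sorted(range(len(sentences)), key=lambda i: (-scores[i], i))
    chosen, used = [], 0
    for i in ranked:
        length = len(tokenize(sentences[i]))
        if used + length <= max_words:
            chosen.append(i)
            used += length
    return [sentences[i] for i in sorted(chosen)]
```

Calling `summarize(sentences, max_words=100)` on a list of sentence strings returns the selected sentences in document order. The greedy word budget is one simple way to stay within a length limit such as the 100–200 words mentioned in the abstract; the paper may use a different selection rule.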