Conference Papers
Permanent URI for this collection: https://idr.nitk.ac.in/handle/123456789/28506
A novel data structure for efficient representation of large data sets in data mining (2006)
Pai, R.M.; Ananthanarayana, V.S.
An important goal in data mining is to generate an abstraction of the data. Such an abstraction helps reduce the time and space requirements of the overall decision-making process. It is also important that the abstraction be generated from the data in a small number of scans. In this paper, we propose a novel data structure called the Prefix-Postfix structure (PP-structure), an abstraction of the data that can be built by scanning the database only once. We prove that this structure is compact, complete and incremental, and is therefore suitable for representing dynamic databases. Further, we propose a clustering algorithm using this structure. The proposed algorithm is tested on different real-world datasets and is shown to be both space-efficient and time-efficient for large datasets without sacrificing accuracy. We compare our algorithm with other algorithms and show its effectiveness. © 2006 IEEE.

Prefix-Suffix trees: A novel scheme for compact representation of large datasets (Springer Verlag, 2007)
Pai, R.M.; Ananthanarayana, V.S.
An important goal in data mining is to generate an abstraction of the data. Such an abstraction helps reduce the time and space requirements of the overall decision-making process. It is also important that the abstraction be generated from the data in a small number of scans. In this paper we propose a novel scheme called Prefix-Suffix trees for compact storage of patterns in data mining. The scheme forms an abstraction of the patterns and is generated from the data in a single scan. This abstraction takes less space and hence forms a compact storage of patterns. Further, we propose a clustering algorithm based on this storage and show experimentally that this type of storage reduces space and time requirements. This has been established on large datasets of handwritten numerals, namely the OCR data, the MNIST data and the USPS data. The proposed algorithm is compared with other similar algorithms, and the efficacy of our scheme is thus established. © Springer-Verlag Berlin Heidelberg 2007.

A New Glowworm Swarm Optimization Based Clustering Algorithm for Multimedia Documents (Institute of Electrical and Electronics Engineers Inc., 2016)
Pushpalatha, K.; Ananthanarayana, V.S.
Due to the explosion of multimedia data, the demand for sophisticated multimedia knowledge discovery systems has increased. The multimodal nature of multimedia data is a major barrier to knowledge extraction. Representing multimodal data in a unimodal space is advantageous for any mining task. We first represent the multimodal multimedia documents in a unimodal space by converting the multimedia objects into signal objects. The dynamic nature of glowworms motivated us to propose the Glowworm Swarm Optimization based Multimedia Document Clustering (GSOMDC) algorithm to group the multimedia documents into topics. The better purity and entropy values indicate that the GSOMDC algorithm successfully clusters the multimedia documents into topics. The quality of the clustering is evaluated through cluster-based retrieval of multimedia documents, which yields better precision values. © 2015 IEEE.
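
The last abstract reports purity and entropy without defining them. As a rough illustration only, the sketch below computes the two measures in their common textbook form: purity as the fraction of documents that fall in the majority class of their cluster, and entropy as the size-weighted average of per-cluster class entropy in bits. The exact definitions used in the paper are not given in this listing, so the formulas, the function names `purity` and `cluster_entropy`, and the toy data are all assumptions.

```python
from collections import Counter, defaultdict
from math import log2


def _group_labels(clusters, labels):
    """Collect the true labels of the items assigned to each cluster id."""
    groups = defaultdict(list)
    for cluster_id, label in zip(clusters, labels):
        groups[cluster_id].append(label)
    return groups


def purity(clusters, labels):
    """Fraction of items in the majority class of their cluster (higher is better)."""
    groups = _group_labels(clusters, labels)
    majority_total = sum(
        Counter(members).most_common(1)[0][1] for members in groups.values()
    )
    return majority_total / len(labels)


def cluster_entropy(clusters, labels):
    """Size-weighted mean of per-cluster class entropy in bits (lower is better).

    Assumption: no normalisation by log2(number of classes) is applied.
    """
    groups = _group_labels(clusters, labels)
    n = len(labels)
    total = 0.0
    for members in groups.values():
        size = len(members)
        h = -sum((k / size) * log2(k / size) for k in Counter(members).values())
        total += (size / n) * h
    return total


# Hypothetical toy data: six documents, two clusters, two topics.
clusters = [0, 0, 0, 1, 1, 1]
topics = ["a", "a", "b", "b", "b", "b"]
print(purity(clusters, topics))
print(cluster_entropy(clusters, topics))
```

A perfect clustering gives purity 1 and entropy 0. Cluster 1 in the toy data is pure, while cluster 0 mixes two topics and accounts for all of the remaining entropy.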
