Title: Genome Data Analysis Using MapReduce Paradigm
Authors: Pahadia, M.
Srivastava, A.
Srivastava, D.
Patil, N.
Issue Date: 2015
Citation: Proceedings - 2015 2nd IEEE International Conference on Advances in Computing and Communication Engineering, ICACCE 2015, 2015, pp. 556-559
Abstract: Counting the number of occurrences of a substring in a string is a problem in many applications. This paper suggests a fast and efficient solution for the field of bioinformatics. A k-mer is a k-length substring of a biological sequence. K-mer counting is defined as counting the number of occurrences of all possible k-mers in a biological sequence. K-mer counting has uses in applications including error correction of sequencing reads, genome assembly, disease prediction, and feature extraction. Current k-mer counting tools are costly in both time and space. We provide a solution which uses MapReduce and Hadoop to reduce the time complexity. After applying the algorithms to real genome datasets, we concluded that the algorithm using Hadoop and the MapReduce paradigm runs more efficiently and reduces the time complexity significantly. © 2015 IEEE.
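Illustration: The abstract does not give the paper's algorithm, so the following is only a minimal sketch of how k-mer counting can be phrased as map, shuffle, and reduce steps. It runs in a single Python process rather than on Hadoop, and the function names (map_kmers, shuffle, reduce_counts, count_kmers) are hypothetical, chosen here for illustration and not taken from the paper.

```python
from collections import defaultdict
from itertools import chain


def map_kmers(read, k):
    """Map phase: emit a (k-mer, 1) pair for every k-length window of one read."""
    for i in range(len(read) - k + 1):
        yield read[i:i + k], 1


def shuffle(pairs):
    """Shuffle phase: group emitted values by key, as a MapReduce framework would."""
    groups = defaultdict(list)
    for kmer, count in pairs:
        groups[kmer].append(count)
    return groups


def reduce_counts(groups):
    """Reduce phase: sum the values collected for each k-mer."""
    return {kmer: sum(counts) for kmer, counts in groups.items()}


def count_kmers(reads, k):
    """Run the three phases in sequence over an iterable of reads."""
    pairs = chain.from_iterable(map_kmers(read, k) for read in reads)
    return reduce_counts(shuffle(pairs))
```

For example, count_kmers(["ACGTAC", "GTAC"], 2) counts AC three times, GT and TA twice each, and CG once. In a real Hadoop job the map and reduce functions would run on separate nodes and the framework would perform the grouping; the in-memory shuffle here only stands in for that step.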
Appears in Collections: 2. Conference Papers

