Improved resource provisioning in Hadoop

Date

2016

Authors

Divya, M.
Annappa, B.

Abstract

Extensive use of the Internet is generating large amounts of data, and the mechanisms needed to handle and analyze this data are becoming more complicated by the day. The Hadoop platform provides a solution for processing huge data sets on large clusters of nodes. Schedulers play a vital role in improving the performance of Hadoop. In this paper, MRPPR (MapReduce Performance Parameter based Resource aware Hadoop Scheduler) is proposed. In MRPPR, performance parameters of the Map task (the time required to parse the data, map, sort, and merge the result) and of the Reduce task (the time to merge, parse, and reduce) are considered in order to categorize each job as CPU bound, Disk I/O bound, or Network I/O bound. Based on the node status obtained from the TaskTracker's response, nodes in the cluster are classified as CPU busy, Disk I/O busy, or Network I/O busy. A cost model is proposed that schedules a job to a node based on these classifications, in order to minimize the makespan and attain effective resource utilization. A performance improvement of 25–30% is achieved with the proposed scheduler. © Springer India 2016.
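
The classify-then-match idea in the abstract can be illustrated with a minimal Python sketch. It is not the MRPPR cost model, which the abstract does not specify. The mapping of task phases to resources, the `Node` structure, the utilization figures, and the rule "pick the node least loaded on the job's bottleneck resource" are all assumptions made here for illustration only.

```python
from dataclasses import dataclass

RESOURCES = ("cpu", "disk", "network")


@dataclass
class Node:
    name: str
    # Hypothetical per-resource load in [0, 1], standing in for the
    # node status a scheduler might derive from heartbeat responses.
    utilization: dict


def classify_job(phase_times, phase_resource):
    """Return the resource that accounts for the most measured task time."""
    totals = {r: 0.0 for r in RESOURCES}
    for phase, seconds in phase_times.items():
        totals[phase_resource[phase]] += seconds
    return max(RESOURCES, key=lambda r: totals[r])


def classify_node(node):
    """Return the resource on which the node is currently busiest."""
    return max(RESOURCES, key=lambda r: node.utilization[r])


def pick_node(job_class, nodes):
    """Assumed stand-in cost: the node's load on the job's bottleneck resource."""
    return min(nodes, key=lambda n: n.utilization[job_class])


# Assumed phase-to-resource mapping; the paper may attribute phases differently.
phase_resource = {"parse": "cpu", "map": "cpu", "sort": "disk", "merge": "disk"}
job = classify_job(
    {"parse": 2.0, "map": 3.0, "sort": 9.0, "merge": 4.0}, phase_resource
)
nodes = [
    Node("n1", {"cpu": 0.2, "disk": 0.9, "network": 0.1}),
    Node("n2", {"cpu": 0.8, "disk": 0.3, "network": 0.2}),
]
print(job, classify_node(nodes[0]), pick_node(job, nodes).name)
```

With these made-up numbers the job spends most of its time in the sort and merge phases, so it is labeled disk bound and placed on `n2`, the node with the lower disk load, rather than on the disk-busy `n1`.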

Citation

Smart Innovation, Systems and Technologies, 2016, Vol. 44, pp. 39-49