Resource aware scheduling in Hadoop for heterogeneous workloads based on load estimation

Date

2013

Authors

Kapil, B.S.
Sowmya, Kamath S.

Abstract

Most cloud-based applications now require large-scale data processing capability, and the volume of data to be processed is growing much faster than available computing power. Hadoop enables distributed processing on large clusters of commodity hardware. In large clusters, workloads may be heterogeneous, that is, a mix of I/O-bound, CPU-bound, or network-intensive jobs that demand different types of resources while running simultaneously on the cluster. Hadoop's job scheduling is based on FIFO and does not take job type into account when parallelizing work. In this paper, we propose a new scheduling algorithm for Hadoop-based distributed systems that classifies workloads and assigns each job category to a particular cluster according to that cluster's current load. Under heterogeneous workloads, the proposed scheduler improves the performance of both CPU and I/O resources in a cluster by approximately 12% compared to Hadoop's FIFO scheduler. © 2013 IEEE.
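
The general idea of classifying jobs and placing them by current load can be illustrated with a minimal Python sketch. This is not the algorithm from the paper, whose details the abstract does not give; the `Job` and `Node` types, the demand estimates, and the "larger demand wins" classification rule are all hypothetical assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    cpu_demand: float  # hypothetical estimate of CPU work
    io_demand: float   # hypothetical estimate of I/O work


@dataclass
class Node:
    name: str
    cpu_load: float = 0.0
    io_load: float = 0.0


def classify(job: Job) -> str:
    """Label a job by its dominant resource (assumed rule, not the paper's)."""
    return "cpu" if job.cpu_demand >= job.io_demand else "io"


def assign(job: Job, nodes: list[Node]) -> Node:
    """Place the job on the node least loaded in the job's dominant resource."""
    if classify(job) == "cpu":
        target = min(nodes, key=lambda n: n.cpu_load)
    else:
        target = min(nodes, key=lambda n: n.io_load)
    # Update the load estimate so later placements see this job.
    target.cpu_load += job.cpu_demand
    target.io_load += job.io_demand
    return target


nodes = [Node("node-a"), Node("node-b")]
jobs = [Job("sort", 2.0, 8.0), Job("pi", 9.0, 1.0), Job("grep", 1.0, 6.0)]
placement = {job.name: assign(job, nodes).name for job in jobs}
print(placement)
```

A FIFO scheduler would place these jobs strictly in arrival order without looking at their type. In the sketch, a CPU-heavy job and an I/O-heavy job can share a node because they contend for different resources.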

Citation

2013 4th International Conference on Computing, Communications and Networking Technologies, ICCCNT 2013, 2013
