Measuring the Optimality of Hadoop Optimization

Published 10 Jul 2013 in cs.DC and cs.PF | (1307.2915v1)

Abstract: In recent years, much research has focused on how to optimize Hadoop jobs. The approaches are diverse, ranging from improving HDFS and the Hadoop job scheduler to optimizing parameters in Hadoop configurations. Despite their success in improving the performance of Hadoop jobs, very little is known about the limit of their optimization performance. That is, how optimal is a given Hadoop optimization? When a Hadoop optimization method X improves the performance of a job by Y%, how do we know if this improvement is as good as it can be? To answer this question, in this paper, we first examine the ideal best case, the lower bound, of running time for Hadoop jobs and develop a measure to accurately estimate how optimal a given Hadoop optimization is with respect to the lower bound. Then, we demonstrate how one may exploit the proposed measure to improve the optimization of Hadoop jobs.
