Equi-depth Histogram Construction for Big Data with Quality Guarantees

Published 17 Jun 2016 in cs.DB and cs.DC (arXiv:1606.05633v1)

Abstract: The amount of data generated and stored in cloud systems has been increasing exponentially; examples include user-generated data, machine-generated data, and data crawled from the Internet. Several frameworks with proven efficiency, such as Apache Hadoop, HDFS, and various NoSQL systems, store and process this petabyte-scale data. These systems are widely used in industry and have therefore been the subject of considerable research; to be practical, proposed data processing techniques should be compatible with them. One of the key data operations is deriving equi-depth histograms, which are crucial for understanding the statistical properties of the underlying data and have many applications, including query optimization. In this paper, we focus on approximate equi-depth histogram construction for big data and propose a novel merge-based construction method, together with a histogram processing framework that constructs an equi-depth histogram for a given time interval. The proposed method builds approximate equi-depth histograms by merging exact equi-depth histograms of partitioned data, while guaranteeing a maximum error bound on the number of items in a bucket (the bucket size) as well as on any range over the histogram.
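The core idea described in the abstract, computing exact equi-depth histograms per partition and then merging them into one approximate histogram, can be sketched as follows. This is a simplified illustration, not the paper's exact algorithm or its error-bound analysis: it treats each bucket's count as concentrated at the bucket's upper boundary when merging, and all function names are hypothetical.

```python
def exact_equi_depth(values, b):
    """Exact equi-depth histogram of one partition:
    b buckets, each as a (upper_bound, count) pair."""
    v = sorted(values)
    n = len(v)
    hist, start = [], 0
    for i in range(1, b + 1):
        end = (i * n) // b          # quantile cut point for bucket i
        hist.append((v[end - 1], end - start))
        start = end
    return hist

def merge_equi_depth(histograms, b):
    """Approximate merge of per-partition equi-depth histograms.
    Pools every bucket as a weighted point at its upper bound, then
    re-splits the pooled mass into b buckets of (roughly) equal depth.
    Returns the b-1 internal split points."""
    points = sorted(pt for h in histograms for pt in h)
    total = sum(c for _, c in points)
    target = total / b              # desired items per merged bucket
    boundaries, acc = [], 0.0
    for upper, count in points:
        acc += count
        while acc >= target * (len(boundaries) + 1) and len(boundaries) < b - 1:
            boundaries.append(upper)
    return boundaries

# Two overlapping partitions, 4 buckets each, merged into one histogram.
h1 = exact_equi_depth(range(0, 100), 4)    # [(24, 25), (49, 25), (74, 25), (99, 25)]
h2 = exact_equi_depth(range(50, 150), 4)   # [(74, 25), (99, 25), (124, 25), (149, 25)]
print(merge_equi_depth([h1, h2], 4))       # [49, 74, 99]
```

Because the merge only sees bucket summaries rather than raw values, the split points can drift from the true quantiles by at most one input bucket's worth of items, which is the kind of per-bucket and per-range error the paper's method bounds explicitly.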

Citations (6)
