Papers
Topics
Authors
Recent
Search
2000 character limit reached

VIP Hashing -- Adapting to Skew in Popularity of Data on the Fly (extended version)

Published 24 Jun 2022 in cs.DB | (2206.12380v1)

Abstract: All data is not equally popular. Often, some portion of data is more frequently accessed than the rest, which causes a skew in popularity of the data items. Adapting to this skew can improve performance, and this topic has been studied extensively in the past for disk-based settings. In this work, we consider an in-memory data structure, namely hash table, and show how one can leverage the skew in popularity for higher performance. Hashing is a low-latency operation, sensitive to the effects of caching, branch prediction, and code complexity among other factors. These factors make learning in-the-loop especially challenging as the overhead of performing any additional operations can be significant. In this paper, we propose VIP hashing, a fully online hash table method, that uses lightweight mechanisms for learning the skew in popularity and adapting the hash table layout. These mechanisms are non-blocking, and their overhead is controlled by sensing changes in the popularity distribution to dynamically switch-on/off the learning mechanism as needed. We tested VIP hashing against a variety of workloads generated by Wiscer, a homegrown hashing measurement tool, and find that it improves performance in the presence of skew (22% increase in fetch operation throughput for a hash table with one million keys under low skew, 77% increase under medium skew) while being robust to insert and delete operations, and changing popularity distribution of keys. We find that VIP hashing reduces the end-to-end execution time of TPC-H query 9, which is the most expensive TPC-H query, by 20% under medium skew.

Citations (7)

Summary

No one has generated a summary of this paper yet.

Paper to Video (Beta)

No one has generated a video about this paper yet.

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Collections

Sign up for free to add this paper to one or more collections.