Papers
Topics
Authors
Recent
Search
2000 character limit reached

HDBSCAN: Density based Clustering over Location Based Services

Published 11 Feb 2016 in cs.DB | (1602.03730v2)

Abstract: Location Based Services (LBS) have become extremely popular and used by millions of users. Popular LBS run the entire gamut from mapping services (such as Google Maps) to restaurants (such as Yelp) and real-estate (such as Redfin). The public query interfaces of LBS can be abstractly modeled as a kNN interface over a database of two dimensional points: given an arbitrary query point, the system returns the k points in the database that are nearest to the query point. Often, k is set to a small value such as 20 or 50. In this paper, we consider the novel problem of enabling density based clustering over an LBS with only a limited, kNN query interface. Due to the query rate limits imposed by LBS, even retrieving every tuple once is infeasible. Hence, we seek to construct a cluster assignment function f(.) by issuing a small number of kNN queries, such that for any given tuple t in the database which may or may not have been accessed, f(.) outputs the cluster assignment of t with high accuracy. We conduct a comprehensive set of experiments over benchmark datasets and popular real-world LBS such as Yahoo! Flickr, Zillow, Redfin and Google Maps.

Citations (15)

Summary

No one has generated a summary of this paper yet.

Paper to Video (Beta)

No one has generated a video about this paper yet.

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Collections

Sign up for free to add this paper to one or more collections.