Reliable Distributed Clustering with Redundant Data Assignment
Published 20 Feb 2020 in cs.DC, cs.DS, cs.IT, cs.LG, and math.IT | (2002.08892v1)
Abstract: In this paper, we present distributed generalized clustering algorithms that can handle large-scale data spread across multiple machines, even when some of those machines straggle or are unreliable. We propose a novel data assignment scheme that enables us to obtain global information about the entire dataset even when some machines fail to respond with the results of their assigned local computations. The assignment scheme leads to distributed algorithms with good approximation guarantees for a variety of clustering and dimensionality reduction problems.
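The redundancy idea in the abstract can be illustrated with a minimal sketch. It is not the paper's assignment scheme, which the abstract does not spell out; it assumes a simple cyclic replication in which each data point is stored on `replication` consecutive machines. The function names (`cyclic_assignment`, `covered_points`, `tolerates_stragglers`) and the parameters are hypothetical and chosen only for illustration. The sketch checks one basic property: with replication `r`, every point is still held by at least one responding machine whenever fewer than `r` machines fail to respond.

```python
from itertools import combinations


def cyclic_assignment(num_points, num_machines, replication):
    """Assign point i to `replication` consecutive machines (mod num_machines).

    Illustrative assumption only: this is a plain cyclic replication,
    not the assignment scheme proposed in the paper.
    """
    if not 1 <= replication <= num_machines:
        raise ValueError("replication must be between 1 and num_machines")
    assignment = {m: [] for m in range(num_machines)}
    for i in range(num_points):
        for k in range(replication):
            assignment[(i + k) % num_machines].append(i)
    return assignment


def covered_points(assignment, responders):
    """Return the set of points held by at least one responding machine."""
    covered = set()
    for m in responders:
        covered.update(assignment[m])
    return covered


def tolerates_stragglers(assignment, num_points, max_stragglers):
    """Brute-force check: does every point survive any `max_stragglers` failures?"""
    machines = list(assignment)
    everything = set(range(num_points))
    for stragglers in combinations(machines, max_stragglers):
        responders = [m for m in machines if m not in stragglers]
        if covered_points(assignment, responders) != everything:
            return False
    return True


if __name__ == "__main__":
    assignment = cyclic_assignment(num_points=12, num_machines=5, replication=3)
    print(tolerates_stragglers(assignment, 12, max_stragglers=2))
    print(tolerates_stragglers(assignment, 12, max_stragglers=3))
```

Coverage alone is a weaker property than the approximation guarantees the abstract claims; the sketch only shows why redundant assignment lets the responding machines collectively see all of the data.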