In-place Parallel Super Scalar Samplesort (IPS$^4$o)
Abstract: We present a sorting algorithm that works in-place, executes in parallel, is cache-efficient, avoids branch mispredictions, and performs $O(n \log n)$ work for arbitrary inputs with high probability. The main algorithmic contributions are new ways to make distribution-based algorithms in-place: on the practical side, we use coarse-grained block-based permutations, and on the theoretical side, we show how to eliminate the recursion stack. Extensive experiments show that our algorithm, IPS$^4$o, scales well on a variety of multi-core machines. We outperform our closest in-place competitor by a factor of up to 3. Even as a sequential algorithm, we are up to 1.5 times faster than the closest sequential competitor, BlockQuicksort.
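The abstract does not spell out how branch mispredictions are avoided. As an illustration only, the following minimal sketch shows the classification idea commonly associated with super scalar samplesort: the sorted splitters are stored as an implicit binary search tree, and each element descends the tree by index arithmetic on a comparison result instead of an if/else. The function names (`build_tree`, `classify`, `distribute`), the power-of-two bucket count, and the use of temporary bucket lists are assumptions made for this sketch; it is neither in-place nor parallel, and it is not the paper's implementation. Python is used only to show the arithmetic; the benefit appears in compiled code, where the comparison can be turned into a conditional instruction.

```python
def build_tree(splitters):
    """Store sorted splitters as an implicit binary search tree.

    Node i has children 2*i and 2*i + 1; index 0 is unused.
    Assumes len(splitters) + 1 (the bucket count) is a power of two.
    """
    k = len(splitters) + 1
    tree = [None] * k

    def fill(node, lo, hi):
        # Place the middle splitter of splitters[lo:hi] at this node.
        if node >= k:
            return
        mid = (lo + hi) // 2
        tree[node] = splitters[mid]
        fill(2 * node, lo, mid)
        fill(2 * node + 1, mid + 1, hi)

    fill(1, 0, len(splitters))
    return tree


def classify(x, tree):
    """Return the bucket index of x without an if/else on the comparison."""
    k = len(tree)
    node = 1
    while node < k:
        # The comparison result (False/True) is added as 0/1 to the index.
        node = 2 * node + (x > tree[node])
    return node - k


def distribute(data, splitters):
    """Distribute data into buckets (uses extra memory, unlike IPS4o)."""
    tree = build_tree(splitters)
    buckets = [[] for _ in range(len(tree))]
    for x in data:
        buckets[classify(x, tree)].append(x)
    return buckets
```

With splitters `[10, 20, 30]`, an element equal to a splitter lands in the bucket to its left, so bucket `i` holds the elements greater than splitter `i - 1` and at most splitter `i`.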