Direct N-Body problem optimisation using the AVX-512 instruction set

Published 21 Jun 2021 in physics.comp-ph and physics.atom-ph | (2106.11143v1)

Abstract: The integration of the equations of motion of N interacting particles is a classical problem in many branches of physics and chemistry. The direct N-body problem is at the heart of simulations studying Coulomb crystals. We present a hand-optimized code for the latest AVX-512 instruction set that achieves a single-core speed-up of $\approx 340\%$ with respect to the version optimized by the compiler. The increased performance is due to an optimization of the memory access organization in the inner loop of the Coulomb interaction and, especially, to the use of an intrinsic function to compute $1/\sqrt{x}$ faster. Our parallelization, implemented in OpenMP, scales excellently with the number of cores. In total, we achieve $\approx 500$ GFLOPS using just a standard workstation with one Intel Skylake CPU (10 cores). This represents $\approx 75\%$ of the theoretical maximum number of double-precision FLOPS corresponding to fused multiply-add (FMA) operations.
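To make the structure of the computation concrete, the following is a minimal scalar sketch in C of a direct Coulomb force loop. It is not the authors' code, and every name in it (`coulomb_forces`, `rsqrt_approx`, `rsqrt_refine`) is hypothetical. It assumes a structure-of-arrays layout (separate `x`, `y`, `z`, `q` arrays) so that the inner loop reads memory contiguously, and it computes $1/\sqrt{x}$ from a cheap estimate refined by Newton-Raphson steps. The bit-level seed is only a portable stand-in for a hardware reciprocal-square-root estimate; the abstract does not say which intrinsic the paper uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One Newton-Raphson step that refines an estimate y of 1/sqrt(x). */
static double rsqrt_refine(double x, double y)
{
    return y * (1.5 - 0.5 * x * y * y);
}

/* Approximate 1/sqrt(x) for x > 0.
 * Assumption: the bit-level seed below is a portable stand-in for a
 * hardware estimate instruction; it is not taken from the paper. */
static double rsqrt_approx(double x)
{
    assert(x > 0.0);
    uint64_t bits;
    double y;
    memcpy(&bits, &x, sizeof bits);
    bits = UINT64_C(0x5FE6EB50C7B537A9) - (bits >> 1); /* rough seed */
    memcpy(&y, &bits, sizeof y);
    for (int k = 0; k < 4; ++k)   /* each step roughly squares the error */
        y = rsqrt_refine(x, y);
    return y;
}

/* Direct O(n^2) Coulomb forces in structure-of-arrays layout:
 * F_i = k * q_i * sum_{j != i} q_j * (r_i - r_j) / |r_i - r_j|^3 */
static void coulomb_forces(size_t n,
                           const double *x, const double *y, const double *z,
                           const double *q, double k,
                           double *fx, double *fy, double *fz)
{
    for (size_t i = 0; i < n; ++i) {
        double ax = 0.0, ay = 0.0, az = 0.0;
        for (size_t j = 0; j < n; ++j) {
            if (j == i)
                continue;                 /* skip self-interaction */
            double dx = x[i] - x[j];
            double dy = y[i] - y[j];
            double dz = z[i] - z[j];
            double r2 = dx * dx + dy * dy + dz * dz;
            double inv_r = rsqrt_approx(r2);
            double s = q[j] * inv_r * inv_r * inv_r;  /* q_j / r^3 */
            ax += s * dx;
            ay += s * dy;
            az += s * dz;
        }
        fx[i] = k * q[i] * ax;
        fy[i] = k * q[i] * ay;
        fz[i] = k * q[i] * az;
    }
}
```

In a hand-vectorized AVX-512 version, the inner loop over `j` would process eight double-precision values per 512-bit register, and the outer loop over `i` is the natural candidate for OpenMP work sharing because each iteration writes only its own `fx[i]`, `fy[i]`, `fz[i]`.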
