Compiler support for semi-manual AoS-to-SoA conversions with data views

Published 21 May 2024 in cs.PL | (2405.12507v2)

Abstract: The C programming language and its cousins such as C++ stipulate the static storage of sets of structured data: Developers have to commit to one, invariant data model -- typically a structure-of-arrays (SoA) or an array-of-structs (AoS) -- unless they manually rearrange, i.e. convert it throughout the computation. Whether AoS or SoA is favourable depends on the execution context and algorithm step. We propose a language extension based upon C++ attributes through which developers can guide the compiler what memory arrangements are to be used. The compiler can then automatically convert (parts of) the data into the format of choice prior to a calculation and convert results back afterwards. As all conversions are merely annotations, it is straightforward for the developer to experiment with different storage formats and to pick subsets of data that are subject to memory rearrangements. Our work implements the annotations within Clang and demonstrates their potential impact through a smoothed particle hydrodynamics (SPH) code.

Summary

  • The paper introduces a novel compiler-driven method using C++ annotations to automate AoS-to-SoA conversions, reducing manual restructuring.
  • It employs custom attributes to specify conversion targets, sizes, inputs, and outputs, enabling local temporary data layout transformations.
  • Performance evaluations with SPH simulations on AMD and Intel systems reveal significant speedups in compute-intensive kernels.

Compiler Support for Semi-Manual AoS-to-SoA Conversions with Data Views

This paper presents a language extension within C++ that enables developers to annotate their code for automatic memory layout transformations from Array of Structs (AoS) to Structure of Arrays (SoA) and vice versa. These transformations are performed by the compiler, allowing developers to dictate the optimal memory arrangement without manual data restructuring.

Introduction

The motivation is that the best data layout depends on context. AoS suits algorithms with non-contiguous access patterns or steps that permute the data, while SoA pays off in compute-bound, vectorization-friendly phases. Converting between the two layouts by hand is complex and error-prone, which calls for a more streamlined approach.
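To make the two layouts concrete, here is a minimal sketch. The `Particle` members, the `ParticlesSoA` type, and the helper functions are hypothetical names chosen for illustration; they are not taken from the paper.

```cpp
#include <cassert>
#include <vector>

// AoS element: all members of one particle sit next to each other.
struct Particle {
    double pos[3];
    double vel[3];
    double mass;
};

// SoA: one array per member, so values of the same member are contiguous.
struct ParticlesSoA {
    std::vector<double> pos_x, pos_y, pos_z;
    std::vector<double> vel_x, vel_y, vel_z;
    std::vector<double> mass;
};

// Manual AoS-to-SoA conversion: the kind of code the annotations aim to avoid.
ParticlesSoA to_soa(const std::vector<Particle>& aos) {
    ParticlesSoA soa;
    for (const Particle& p : aos) {
        soa.pos_x.push_back(p.pos[0]);
        soa.pos_y.push_back(p.pos[1]);
        soa.pos_z.push_back(p.pos[2]);
        soa.vel_x.push_back(p.vel[0]);
        soa.vel_y.push_back(p.vel[1]);
        soa.vel_z.push_back(p.vel[2]);
        soa.mass.push_back(p.mass);
    }
    assert(soa.mass.size() == aos.size());
    return soa;
}

// AoS traversal: consecutive masses are sizeof(Particle) bytes apart.
double total_mass_aos(const std::vector<Particle>& aos) {
    double sum = 0.0;
    for (const Particle& p : aos) sum += p.mass;
    return sum;
}

// SoA traversal: consecutive masses are adjacent (unit stride).
double total_mass_soa(const ParticlesSoA& soa) {
    double sum = 0.0;
    for (double m : soa.mass) sum += m;
    return sum;
}
```

Both traversals compute the same value; they differ only in the memory stride between consecutive reads, which is what makes the SoA loop the easier one to vectorize.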

The proposed solution introduces C++ attributes that guide the compiler in performing these conversions. This approach allows experimentation with different storage formats without altering the underlying code. These annotations facilitate seamless transitions between AoS and SoA, optimizing performance across diverse computational phases, such as in smoothed particle hydrodynamics (SPH) simulations.

Methodology

The methodology revolves around using custom C++ attributes to dictate the desired data layout, allowing the compiler to perform temporary out-of-place conversions. Below are the key attributes introduced:

  • [[clang::soa_conversion_target]]: Designates the array to convert into SoA.
  • [[clang::soa_conversion_target_size]]: Specifies the size of the target array.
  • [[clang::soa_conversion_inputs]]: Lists the struct members to copy into the SoA representation.
  • [[clang::soa_conversion_outputs]]: Lists the struct members to copy back into the original AoS after the loop has executed.

These annotations result in a local transformation where only specific portions of the data are rearranged temporarily, thereby minimizing overhead.

Example Annotation

void drift(Particle *particles, int size) {
    [[clang::soa_conversion_target(particles)]]
    [[clang::soa_conversion_target_size(size)]]
    [[clang::soa_conversion_inputs(pos, vel, updated)]]
    [[clang::soa_conversion_outputs(pos, updated)]]
    for (int i = 0; i < size; i++) {
        auto &p = particles[i];  // reference: a copy would discard the updates
        p.pos[0] += p.vel[0] * dt;
        p.updated = true;
    }
}
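
As a rough mental model, the annotated loop above behaves like the following hand-written gather, compute, scatter sequence. This is a sketch under stated assumptions, not the code Clang generates: the `Particle` definition, the `DriftView` type, and the name `drift_manual` are hypothetical, `dt` is passed as a parameter because the example above does not declare it, and only component 0 of `pos` and `vel` is copied because that is all the loop body touches.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical particle record; the source does not show its definition.
struct Particle {
    double pos[3];
    double vel[3];
    double mass;     // not listed as input or output, so never copied
    bool updated;
};

// Hypothetical temporary SoA view over the members the loop uses.
struct DriftView {
    std::vector<double> pos0;
    std::vector<double> vel0;
    std::vector<char> updated;
};

void drift_manual(Particle* particles, int size, double dt) {
    assert(size >= 0);
    const std::size_t n = static_cast<std::size_t>(size);
    DriftView view{std::vector<double>(n), std::vector<double>(n),
                   std::vector<char>(n)};

    // 1. Gather the inputs (pos, vel, updated) into the temporary view.
    for (std::size_t i = 0; i < n; i++) {
        view.pos0[i] = particles[i].pos[0];
        view.vel0[i] = particles[i].vel[0];
        view.updated[i] = particles[i].updated ? 1 : 0;
    }

    // 2. Run the loop body on the view (unit-stride accesses).
    for (std::size_t i = 0; i < n; i++) {
        view.pos0[i] += view.vel0[i] * dt;
        view.updated[i] = 1;
    }

    // 3. Scatter only the outputs (pos, updated) back; vel is input-only.
    for (std::size_t i = 0; i < n; i++) {
        particles[i].pos[0] = view.pos0[i];
        particles[i].updated = view.updated[i] != 0;
    }
}
```

Steps 1 and 3 are the overhead that the annotations introduce; they only pay off when step 2 is expensive enough to amortize the copies.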

Compiler Realization

The approach is implemented within the Clang compiler and automates the AoS-to-SoA conversion. It works on the abstract syntax tree (AST): the compiler inserts the required conversion logic and redirects the data accesses in the annotated code to the converted memory layout.

After parsing the code, the compiler introduces temporary data structures that hold the rearranged data, and it copies the results back into the original layout once the computation has finished (Figure 1).

Figure 1: Scalability plots for 2048² particles on a single node for four different kernels (Genoa). We benchmark the baseline code (AoS) against a version which converts all of the particle data (SoA) against a version which works with views.

Performance Evaluation

The approach was evaluated on SPH kernels, measuring both how performance scales with the thread count and how it depends on the particle count, on two architectures: AMD EPYC 9654 (Genoa) and Intel Xeon Gold 6430 (Sapphire Rapids).

The results indicate that computationally cheap phases benefit less, because of the conversion overhead, whereas compute-intensive kernels such as the force calculation gain significantly from the improved memory access that the temporary SoA conversion enables (Figure 2).

Figure 2: Dependency on particle counts for fixed thread numbers (8, 16 and 32). Sapphire Rapids (top) vs. Genoa data (bottom). The L2 cache size denotes the size for a single thread, i.e. how many particles would fit into one single L2 cache.
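
The "particles per L2 cache" marker in the caption is a simple division, sketched below. All numbers in the usage note are assumptions for illustration and are not taken from the paper: the cache sizes (1 MiB and 2 MiB per core) and the record sizes (64 bytes for a full AoS record, 24 bytes for a view holding three doubles) are hypothetical inputs, and `particles_per_l2` is a made-up helper name.

```cpp
#include <cassert>
#include <cstddef>

// Number of whole particle records that fit into one thread's L2 cache.
// Both arguments are supplied by the caller; nothing here is measured.
inline std::size_t particles_per_l2(std::size_t l2_bytes,
                                    std::size_t bytes_per_particle) {
    assert(bytes_per_particle > 0);
    return l2_bytes / bytes_per_particle;  // integer division rounds down
}
```

With an assumed 1 MiB L2 cache, `particles_per_l2(1u << 20, 64)` gives 16384 for a 64-byte record, while `particles_per_l2(1u << 20, 24)` gives 43690 for a 24-byte view. A narrower view therefore keeps more particles cache-resident than the full record does.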

The study also found that the throughput trends of the plain AoS code and the view-based code diverge, most visibly once the particle data no longer fits into the L2 cache.

Conclusion

The paper advocates for an innovative compiler-driven strategy to alleviate the burden of manual data restructuring, mediated by a novel use of C++ annotations. The approach offers a pragmatic balance of flexibility in data layout with enhanced performance outcomes, especially beneficial in high-performance computing scenarios. The attribute-guided AoS-to-SoA conversion has potential applications beyond SPH codes, addressing broader challenges in data layout optimization across diverse architectures.

This research sets a precedent for integrating intelligent data transformations within standard compiler pipelines, proposing a path forward for further compiler enhancements and efficient utilization of evolving hardware architectures.
