Determine optimal Unified Memory pool settings for large-mesh assembly performance
Determine the configuration of the NVIDIA HPC SDK nvc++ Unified Memory pool parameters (NVCOMPILER_ACC_POOL_ALLOC, NVCOMPILER_ACC_POOL_SIZE, and NVCOMPILER_ACC_POOL_THRESHOLD) that maximizes assembly-phase performance of the OpenFOAM laplacianFoam proof-of-concept across mesh sizes and GPU architectures, and that prevents the per-iteration deallocation and associated slowdowns observed under the default settings.
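One way to approach this question empirically is a grid sweep over the pool settings. The following is a minimal sketch, not the authors' methodology: the helper names (`pool_env_grid`, `time_run`, `sweep`) are hypothetical, and the value strings passed for the pool size and threshold are placeholders whose accepted format should be checked against the HPC SDK documentation. It times the whole run rather than the assembly phase alone, so isolating assembly would need solver-side instrumentation.

```python
import itertools
import os
import subprocess
import time


def pool_env_grid(pool_sizes, thresholds, pool_alloc="1"):
    """Return one environment-override dict per (size, threshold) pair.

    The values are passed through as opaque strings; their valid formats
    are an assumption to verify against the compiler documentation.
    """
    return [
        {
            "NVCOMPILER_ACC_POOL_ALLOC": str(pool_alloc),
            "NVCOMPILER_ACC_POOL_SIZE": str(size),
            "NVCOMPILER_ACC_POOL_THRESHOLD": str(threshold),
        }
        for size, threshold in itertools.product(pool_sizes, thresholds)
    ]


def time_run(command, overrides, repeats=3):
    """Run `command` `repeats` times with the overrides applied.

    Returns the best (smallest) wall-clock time in seconds. This covers
    the full process lifetime, not only the assembly phase.
    """
    env = {**os.environ, **overrides}
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(command, env=env, check=True, stdout=subprocess.DEVNULL)
        best = min(best, time.perf_counter() - start)
    return best


def sweep(command, grid, repeats=3):
    """Time every configuration; return (overrides, seconds) pairs, fastest first."""
    results = [(cfg, time_run(command, cfg, repeats)) for cfg in grid]
    return sorted(results, key=lambda item: item[1])
```

A call such as `sweep(solver_command, pool_env_grid(sizes, thresholds))` would be repeated for each mesh size and GPU of interest, where `solver_command` is the argument list that launches the accelerated laplacianFoam build on a given case and `sizes` and `thresholds` are the candidate values to compare.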
References
Although interesting, this work is not focused on deeply investigating the best allocator size setting and leave this analysis for future work.
— Building an Accelerated OpenFOAM Proof-of-Concept Application using Modern C++
(2507.18268 - Malenza et al., 24 Jul 2025) in Section: Evaluation, Subsection: Performance results on single GPU