Improving Memory Dependence Prediction with Static Analysis

Published 12 Mar 2024 in cs.PL and cs.AR | (2403.08056v3)

Abstract: This paper explores the potential of communicating information gained by static analysis from compilers to Out-of-Order (OoO) machines, focusing on the memory dependence predictor (MDP). The MDP enables loads to issue without all in-flight store addresses being known, with minimal memory order violations. We use LLVM to find loads with no dependencies and label them via their opcode. These labelled loads skip making lookups into the MDP, improving prediction accuracy by reducing false dependencies. We communicate this information in a minimally intrusive way, i.e., without introducing additional hardware costs or instruction bandwidth, providing these improvements without any additional overhead in the CPU. We find that in select cases in SPEC2017, a significant number of load instructions can skip interacting with the MDP, leading to a performance gain. These results point to greater possibilities for static analysis as a source of near-zero-cost performance gains in future CPU designs.
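The bypass idea in the abstract can be illustrated with a toy model. The sketch below is not the paper's predictor or its gem5 implementation. It assumes a tiny direct-mapped table indexed by load PC, in which two loads can alias to the same entry and so inherit each other's prediction. The names `ToyMDP`, `Load`, and the `no_dep` flag are hypothetical; the flag stands in for the opcode-based label that the abstract describes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Load:
    pc: int
    # Hypothetical stand-in for the compiler-provided label: True means
    # static analysis found no store this load can depend on.
    no_dep: bool = False


class ToyMDP:
    """Toy direct-mapped predictor: one 'wait for older stores' bit per entry."""

    def __init__(self, entries: int = 4):
        self.entries = entries
        self.table = [False] * entries
        self.lookups = 0  # number of loads that actually consulted the table

    def _index(self, pc: int) -> int:
        # Assumed indexing scheme: low bits of the PC, so distinct loads can alias.
        return pc % self.entries

    def train_violation(self, pc: int) -> None:
        # After a memory order violation, mark the entry so later loads wait.
        self.table[self._index(pc)] = True

    def must_wait(self, load: Load) -> bool:
        # Labelled loads skip the lookup entirely, so an aliased entry
        # cannot impose a false dependence on them.
        if load.no_dep:
            return False
        self.lookups += 1
        return self.table[self._index(load.pc)]


if __name__ == "__main__":
    mdp = ToyMDP(entries=4)
    mdp.train_violation(0x10)  # some other load caused a violation
    print(mdp.must_wait(Load(pc=0x20)))               # aliases with 0x10
    print(mdp.must_wait(Load(pc=0x20, no_dep=True)))  # labelled, skips the table
    print(mdp.lookups)
```

In this toy setup the unlabelled load at `0x20` shares a table entry with the load at `0x10` and is made to wait even though it never caused a violation, whereas the labelled copy issues immediately and leaves the lookup counter unchanged.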

