Papers
Topics
Authors
Recent
Search
2000 character limit reached

High-Performance Code Generation though Fusion and Vectorization

Published 24 Oct 2017 in cs.PF and cs.DC | (1710.08774v1)

Abstract: We present a technique for automatically transforming kernel-based computations in disparate, nested loops into a fused, vectorized form that can reduce intermediate storage needs and lead to improved performance on contemporary hardware. We introduce representations for the abstract relationships and data dependencies of kernels in loop nests and algorithms for manipulating them into more efficient form; we similarly introduce techniques for determining data access patterns for stencil-like array accesses and show how this can be used to elide storage and improve vectorization. We discuss our prototype implementation of these ideas---named HFAV---and its use of a declarative, inference-based front-end to drive transformations, and we present results for some prominent codes in HPC.

Citations (2)

Summary

Paper to Video (Beta)

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Collections

Sign up for free to add this paper to one or more collections.