SC20 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

Toward Automated Kernel Fusion for the Optimization of Scientific Applications


Workshop: LLVM-HPC2020: The Sixth Workshop on the LLVM Compiler Infrastructure in HPC

Authors: Andrew Lamzed-Short (University of Warwick); Timothy Law (Atomic Weapons Establishment (AWE), UK); Andrew Mallinson (Intel Corporation); Gihan Mudalige (University of Warwick); and Stephen Jarvis (University of Birmingham)


Abstract: We introduce a novel transformation pass written using LLVM that performs kernel fusion. We demonstrate the correctness and performance of the pass on several example programs inspired by scientific applications of interest. The method achieves up to 4x speedup relative to unfused versions of the programs, and exact performance parity with manually fused versions. In contrast to previous work, it also requires minimal user intervention. Our approach is facilitated by a new loop fusion algorithm capable of interprocedurally fusing both skewed and unskewed loops in different kernels.
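
For readers unfamiliar with the term, the following is a minimal sketch of what kernel fusion means in general, not the authors' LLVM pass. The kernel names (`scale_kernel`, `add_kernel`) and the data are hypothetical, chosen only for illustration, and the example covers only the simple unskewed case, in which both loops share the same iteration space.

```python
def scale_kernel(a, k):
    """First kernel: one full pass over the input, producing an intermediate array."""
    return [k * x for x in a]


def add_kernel(b, c):
    """Second kernel: a second full pass that reads the intermediate array."""
    return [x + y for x, y in zip(b, c)]


def unfused(a, c, k):
    """Run the two kernels back to back, materializing the intermediate array."""
    b = scale_kernel(a, k)   # intermediate result written out in full
    return add_kernel(b, c)  # ... and then read back in a separate loop


def fused(a, c, k):
    """Manually fused version: one loop, intermediate value kept per iteration."""
    return [k * x + y for x, y in zip(a, c)]


a = [1.0, 2.0, 3.0]
c = [10.0, 20.0, 30.0]
print(unfused(a, c, 2.0) == fused(a, c, 2.0))
```

The fused version traverses the data once and never stores the intermediate array, which is the usual source of the memory-traffic savings that fusion targets. Fusing skewed loops, where one kernel's iteration depends on a shifted index of the other, additionally requires aligning the iteration spaces and is not shown here.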




