Workshop: MCHPC’20: Workshop on Memory Centric High-Performance Computing
Authors: Neil Butcher (University of Notre Dame), Stephen Olivier (Sandia National Laboratories), and Peter Kogge (University of Notre Dame)
Abstract: Many-core systems are beginning to feature novel large, high-bandwidth intermediate memory as a visible part of the memory hierarchy. This paper discusses how to make use of intermediate memory when composing multiple matrix operations.
We re-purpose the cache-oblivious approach developed by Frigo and apply it to the composition of a notionally bandwidth-bound kernel (transpose) with a compute-bound kernel (matrix multiply). We focus in particular on matrix shapes far from square, a region that is not usually considered. The resulting example is far simpler than optimized codes, yet comes reasonably close to them in performance. Perhaps more importantly, it develops a paradigm for constructing other codes that use intermediate memories.
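As a rough illustration of the idea in the abstract, the sketch below fuses a transpose into a recursive, cache-oblivious style matrix multiply by computing C = AᵀB without ever materializing Aᵀ. It is not the authors' code: the function names, the list-of-lists storage, and the `base` cutoff are all assumptions made for this example. The recursion always halves the largest of the three dimensions, which is what lets the same code handle shapes far from square.

```python
def _rec(a, b, c, i0, i1, j0, j1, k0, k1, base):
    """Accumulate c[i][j] += sum over k of a[k][i] * b[k][j] on the given index ranges."""
    di, dj, dk = i1 - i0, j1 - j0, k1 - k0
    if max(di, dj, dk) <= base:
        # Base case: the block is small, so use plain loops.
        # Reading a[k][i] instead of a[i][k] is the fused transpose.
        for i in range(i0, i1):
            row_c = c[i]
            for k in range(k0, k1):
                a_ki = a[k][i]
                row_b = b[k]
                for j in range(j0, j1):
                    row_c[j] += a_ki * row_b[j]
        return
    # Recursive case: split the largest dimension in half.
    if di >= dj and di >= dk:
        mid = i0 + di // 2
        _rec(a, b, c, i0, mid, j0, j1, k0, k1, base)
        _rec(a, b, c, mid, i1, j0, j1, k0, k1, base)
    elif dj >= dk:
        mid = j0 + dj // 2
        _rec(a, b, c, i0, i1, j0, mid, k0, k1, base)
        _rec(a, b, c, i0, i1, mid, j1, k0, k1, base)
    else:
        mid = k0 + dk // 2
        _rec(a, b, c, i0, i1, j0, j1, k0, mid, base)
        _rec(a, b, c, i0, i1, j0, j1, mid, k1, base)


def transpose_multiply(a, b, base=16):
    """Return A^T * B for a (k x m) and b (k x n), both given as lists of rows.

    `base` is a hypothetical cutoff (must be >= 1) below which plain loops are used.
    """
    k = len(a)
    m = len(a[0]) if k else 0
    n = len(b[0]) if k else 0
    c = [[0.0] * n for _ in range(m)]
    _rec(a, b, c, 0, m, 0, n, 0, k, base)
    return c


if __name__ == "__main__":
    a = [[1, 2, 3], [4, 5, 6]]   # 2 x 3, so A^T is 3 x 2
    b = [[1, 0], [0, 1]]         # 2 x 2 identity
    print(transpose_multiply(a, b, base=1))
```

Because no cache or memory size appears anywhere in the recursion, the same decomposition applies unchanged at every level of a memory hierarchy, including an intermediate memory; the cutoff only controls where the recursion overhead stops.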