SC20 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

Rocket: Efficient and Scalable All-Pairs Computations on Heterogeneous Platforms

Authors: Stijn Heldens (Netherlands eScience Center, University of Amsterdam); Pieter Hijma (Vrije University Amsterdam, University of Amsterdam); Ben van Werkhoven and Jason Maassen (Netherlands eScience Center); Henri Bal (Vrije University Amsterdam); and Rob van Nieuwpoort (Netherlands eScience Center, University of Amsterdam)

Abstract: All-pairs compute problems apply a user-defined function to each combination of two items of a given data set. Although these problems present an abundance of parallelism, data reuse must be exploited to achieve good performance. Several researchers have considered this problem, resorting either to partial replication with static work distribution or to dynamic scheduling with full replication. In contrast, we present a solution that relies on hierarchical multi-level software-based caches to maximize data reuse at each level in the distributed memory hierarchy, combined with a divide-and-conquer approach to exploit data locality, hierarchical work-stealing to dynamically balance the workload, and asynchronous processing to maximize resource utilization. We evaluate our solution using three real-world applications, from digital forensics, localization microscopy, and bioinformatics, on different platforms, from a desktop machine to a supercomputer. Results show excellent efficiency and scalability when scaling to 96 GPUs, even obtaining super-linear speedups due to a distributed cache.
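
To make the all-pairs pattern concrete, the following is a minimal single-process sketch in Python. It is not Rocket's API or implementation: the names `LRUCache`, `all_pairs`, and `leaf` are hypothetical, the cache is a plain in-memory LRU standing in for one level of a software cache, and work-stealing, asynchronous processing, and GPUs are left out entirely. It only illustrates applying a user-defined function to every pair while visiting the pair space by recursive subdivision, so that pairs sharing items are processed close together.

```python
from collections import OrderedDict


class LRUCache:
    """Small software cache (hypothetical): keeps at most `capacity` loaded items."""

    def __init__(self, capacity, load):
        self.capacity = capacity
        self.load = load          # function that fetches an item by index
        self.items = OrderedDict()
        self.misses = 0           # number of times `load` had to be called

    def get(self, key):
        if key in self.items:
            self.items.move_to_end(key)      # mark as most recently used
            return self.items[key]
        self.misses += 1
        value = self.load(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)   # evict least recently used
        return value


def all_pairs(n, cache, f, leaf=2):
    """Apply f to every pair (i, j) with i < j, splitting the pair space into quadrants."""
    results = {}

    def visit(r0, r1, c0, c1):
        # Region covers rows [r0, r1) and columns [c0, c1).
        # Skip empty regions and regions that contain no pair with i < j.
        if r0 >= r1 or c0 >= c1 or r0 >= c1 - 1:
            return
        if r1 - r0 <= leaf and c1 - c0 <= leaf:
            # Small tile: its items fit in the cache together.
            for i in range(r0, r1):
                for j in range(max(c0, i + 1), c1):
                    results[(i, j)] = f(cache.get(i), cache.get(j))
            return
        rm, cm = (r0 + r1) // 2, (c0 + c1) // 2
        visit(r0, rm, c0, cm)
        visit(r0, rm, cm, c1)
        visit(rm, r1, cm, c1)
        visit(rm, r1, c0, cm)

    visit(0, n, 0, n)
    return results
```

As a usage example, `all_pairs(8, LRUCache(capacity=4, load=lambda k: k * 10), lambda a, b: a + b)` returns a dictionary with one entry per unordered pair. Inspecting `cache.misses` afterwards, and comparing it against a plain row-by-row loop over the same cache, gives a rough feel for how traversal order affects data reuse.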
