OpenCL-Enabled Parallel Raytracing for Astrophysical Application on Multiple FPGAs with Optical Links
Session: H2RC 2020: Sixth International Workshop on Heterogeneous High-Performance Reconfigurable Computing
Event Type
Workshop
Accelerators, FPGA, and GPUs
Architectures
Emerging Technologies
Heterogeneous Systems
Reconfigurable Computing
Time: Friday, 13 November 2020, 5:45pm - 6:15pm EDT
Location: Track 6
Description: In earlier work, we optimized the Authentic Radiative Transfer (ART) method, which solves space radiative transfer problems in early-universe astrophysical simulations, for Intel Arria 10 FPGAs. In this paper, we optimize it for the latest FPGA, the Intel Stratix 10, and evaluate its performance against a GPU implementation on multiple nodes. For the multi-FPGA computing and communication framework, we apply our original system, the Communication Integrated Reconfigurable CompUting System (CIRCUS), which enables OpenCL-based programming that uses multiple optical links on the FPGAs for parallel FPGA processing. This is the first implementation of a real application on CIRCUS.
The FPGA implementation is 4.54, 8.41, and 10.64 times faster than the GPU implementation on 1, 2, and 4 nodes, respectively, where the multi-GPU runs use an InfiniBand HDR100 network. It also achieves 94.2% parallel efficiency on 4 FPGAs. We believe this efficiency comes from CIRCUS's low-latency, high-efficiency pipelined communication, which also makes multi-FPGA programming in OpenCL easy for high-performance computing applications.
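To make the 94.2% figure concrete, here is a minimal sketch of how parallel efficiency is conventionally computed. It assumes the usual strong-scaling definition (speedup over one device divided by the number of devices). The abstract does not say which definition the authors use, so the implied speedup below is an illustration under that assumption, not a reported result. The function names are hypothetical.

```python
def parallel_efficiency(t_single: float, t_parallel: float, n_devices: int) -> float:
    """Strong-scaling efficiency: (t_single / t_parallel) / n_devices.

    Assumption: same total problem size on 1 device and on n_devices.
    """
    if n_devices < 1 or t_single <= 0 or t_parallel <= 0:
        raise ValueError("times must be positive and n_devices >= 1")
    return (t_single / t_parallel) / n_devices


def implied_speedup(efficiency: float, n_devices: int) -> float:
    """Invert the definition above: speedup = efficiency * n_devices."""
    return efficiency * n_devices


# Values taken from the abstract: 94.2% efficiency on 4 FPGAs.
reported_efficiency = 0.942
n_fpgas = 4

# Under the strong-scaling assumption, this is the speedup of 4 FPGAs over 1 FPGA.
speedup = implied_speedup(reported_efficiency, n_fpgas)
print(f"implied speedup on {n_fpgas} FPGAs: {speedup:.3f}x")
```

Under this assumption, 4 FPGAs would run roughly 3.77 times faster than 1 FPGA, against an ideal of 4 times.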