Iris: Allocation Banking and Identity and Access Management for the Exascale Era
Accelerators, FPGA, and GPUs
Reliability and Resiliency
System Software and Runtime Systems
Time: Tuesday, 17 November 2020, 4pm - 4:30pm EDT
Description: Without a reliable and scalable system for managing authorized users and ensuring they receive their allocated share of computational and storage resources, modern HPC centers would not be able to function. Exascale will amplify these demands with greater machine scale, more users, higher job throughput, and an ever-increasing need for management insight and automation throughout the HPC environment. When our legacy system reached retirement age, NERSC took the opportunity to design and build Iris not only to meet our current needs, with 8,000 users and tens of thousands of jobs per day, but also to scale well into the exascale era. In this paper, we describe how we designed Iris to meet these needs and discuss its key features and our implementation experience.