SC20 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

TOSS-2020: A Commodity Software Stack for HPC

Authors: Edgar A. Leon, Trent D'Hooge, Nathan Hanford, Ian Karlin, Ramesh Pankajakshan, Jim Foraker, Chris Chambreau, and Matthew L. Leininger (Lawrence Livermore National Laboratory)

Abstract: The simulation environment of any HPC platform is key to the performance, portability and productivity of scientific applications. This environment has traditionally been provided by platform vendors, which presents challenges for HPC centers and users, including platform-specific software that tends to stagnate over the lifetime of the system. In this paper, we present the Tri-Laboratory Operating System Stack (TOSS), a production simulation environment based on Linux and open source software, with proprietary software components integrated as needed. TOSS, which focuses on mid-to-large-scale commodity HPC systems, provides a common simulation environment across system architectures, reduces the learning curve on new systems and benefits from a lineage of past experience and bug fixes. To broaden the scope and applicability of TOSS, we demonstrate its feasibility and effectiveness on a leadership-class supercomputer architecture. Our evaluation, relative to the vendor stack, includes an analysis of resource manager complexity, system noise, networking and application performance.
