SC20 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

Emulating I/O Behavior in Scientific Workflows on High Performance Computing Systems

Workshop: Fifth International Parallel Data Systems Workshop

Authors: Fahim Tahmid Chowdhury and Yue Zhu (Florida State University); Francesco Di Natale, Adam Moody, Elsa Gonsiorowski, and Kathryn Mohror (Lawrence Livermore National Laboratory); and Weikuan Yu (Florida State University)

Abstract: Scientific application workflows leverage the capabilities of cutting-edge high-performance computing (HPC) facilities to enable complex applications for academia, research, and industry communities. Data transfer and I/O dependencies among the modules of modern HPC workflows can increase complexity and hamper overall workflow performance. Understanding the complexity that arises from data dependencies and dataflow is an essential prerequisite for developing optimization strategies that improve I/O performance and, eventually, the performance of the entire workflow.

In this paper, we discuss dataflow patterns for workflow applications on HPC systems. Because existing I/O benchmarking tools fall short in identifying and representing the dataflow in modern HPC workflows, we have implemented Wemul, an open-source workflow I/O emulation framework, to mimic the different types of I/O behavior demonstrated by common and complex HPC application workflows for deeper analysis. We elaborate on the features and usage of Wemul, demonstrate its application to HPC workflows, and discuss insights from the performance analysis results on the Lassen supercomputing cluster at Lawrence Livermore National Laboratory (LLNL).

