The Superfacility Project: Automated Pipelines for Experiments and HPC
Session: Data Management
Event Type: State of the Practice Talk
Best Practices
Big Data
TP
Time: Tuesday, 17 November 2020, 10:30am - 11am EDT
Location: Track 7
Description: As data sets from DOE user facilities grow in both size and complexity, HPC facilities face an urgent need for new capabilities to transfer, analyze, store and curate data in order to facilitate scientific discovery. In response, NERSC and ESnet have expanded services and capabilities in support of these workflows. In this talk, we introduce the Superfacility project at LBNL - a framework for integrating experimental and observational research instruments with computational and data facilities at NERSC and ESnet. We will discuss the science requirements driving our technical innovations in data management, workload scheduling, networking, and automation. We will illustrate the impact of this work using examples of teams that are using our systems for real-time experimental data analysis, challenging our infrastructure in new ways. In particular, we will focus on the new ways experimental scientists are accessing HPC facilities and what the future holds for automated data analysis pipelines.