Visual Data Management at NERSC
State of the Practice Talk
Time: Tuesday, 17 November 2020, 10am - 10:30am EST
Description: Wrangling data at a scientific computing center can be a major challenge for users, particularly as data volumes continue to grow. The National Energy Research Scientific Computing Center (NERSC) has roughly 60 PB of shared storage using more than 2.2 billion inodes, and a 150 PB high-performance tape archive, all accessible from the Cori supercomputer. To help manage exponentially increasing data volumes, we have designed and built a “Data Dashboard”, a web-enabled visual application where users can easily review and manage their data. We are also developing a “PI Toolbox” that lets users directly control the permissions of their files and directories, as well as a “PB Data Portal” to facilitate sharing large volumes of scientific data. We describe the process for developing our tools, the framework supporting them, and the challenges such a framework faces moving into the exascale age.