SC20 Is Everywhere We Are

Postdoctoral Appointee - DataStates
Argonne National Laboratory
Lemont, IL
Session: Job Fair
Event Type: Job Posting
Registration Categories: TP, W, TUT, XO
Time: Monday, 9 November 2020, 9am - 8pm EDT
Position Description:
The Exascale Computing Project (ECP) works closely with large-scale scientific applications that are increasingly driven by scalable deep learning (e.g., CANDLE, the Cancer Deep Learning Environment) running on the largest supercomputers in the world. In this context, we develop efficient techniques to capture, manipulate and persist large amounts of data in a consistent and resilient fashion; some of these techniques are illustrated by the VELOC project, a low-overhead checkpointing system. Currently, we are exploring a new data model centered on the notion of data states: intermediate representations of datasets that are automatically recorded into a lineage when applications tag them with hints, constraints and persistency semantics. This approach lets applications focus on the meaning and properties of their data rather than on how to access it, reducing complexity while unlocking high performance and scalability for many use cases: finding and reusing previous intermediate results to explore alternatives, inspecting the evolution of datasets, verifying correctness, etc. This is especially important in deep learning, where there is an acute need for advanced tools that explore many alternative DNN models and/or ensembles to improve accuracy, training speed and the ability to generalize or explain a problem.

In addition to addressing such transformative challenges at the intersection of HPC, big data analytics and machine learning, you will have the opportunity to work closely with many domain experts to identify the requirements and bottlenecks of real-life scientific applications that address the needs of our society over the coming decades. More broadly, you will be part of a vibrant and diverse research community from more than 100 countries. Our lab hosts Aurora, one of the first exascale supercomputers in the world, which you will have an opportunity to use for your experiments. You will also have access to a large array of leading-edge experimental testbeds through the Joint Laboratory for System Evaluation (JLSE), which feature the latest technologies from top vendors such as Intel, NVIDIA and AMD.

Position Requirements:
Candidates are required to have earned (or be close to earning) a PhD and to have a strong scientific background in distributed computing, and in HPC in particular, including:

Strong code development skills with C/C++ and Python
Familiarity with modern data management and I/O best practices
Familiarity with machine/deep learning
Candidates should also be familiar with large-scale deep learning techniques: data, model and pipeline parallelism. They should be able to conduct interdisciplinary research at the intersection of HPC and deep learning, and to participate in teamwork and broad collaborative efforts involving other laboratories, universities, supercomputer centers and industry.