SC20 Is Everywhere We Are

Kubernetes Platform Systems Engineer · Oak Ridge National Laboratory · Oak Ridge, TN
Session: Job Fair
Event Type: Job Posting
Registration Categories: TP, W, TUT, XO
Time: Monday, 9 November 2020, 9am - 8pm EDT
Description

Overview:



The National Center for Computational Sciences (NCCS) at the Oak Ridge National Laboratory (ORNL) is seeking highly qualified individuals to play a key role in improving the security, performance, and reliability of the NCCS computing infrastructure. The NCCS is a leadership computing facility providing high performance computing resources for tackling scientific grand challenges.



The Team



The Platforms group is tasked with architecting and running Slate, our Kubernetes platform, which provides a service for NCCS users and staff to develop, manage, and deliver their own applications that integrate with NCCS HPC resources.



We strive to provide the best Kubernetes service for both our internal staff and our scientific users. We achieve this goal in part by dogfooding: we use Kubernetes to run all of the internal services we support. We have great opportunities to help other staff develop their applications on the platform and to work with our outstanding scientific community as we bring Kubernetes to the HPC world.



We are at the intersection of container orchestration and HPC. Come help us build the bridge.



About you



We are looking for an experienced systems engineer who can code and focus on customer success. You handle infrastructure with code because automation lets you focus on the more difficult and rewarding problems. You love collaborating with others and coming up with the best solution to the problem. You enjoy learning new technologies and pick them up quickly. You love CI/CD and GitOps. You probably have production experience with Kubernetes and Golang. You may have a GitHub account with cool projects. You may have technical leadership experience.



Tools we use: Kubernetes, OpenShift, Helm, Prometheus, RHEL, GitLab CI, Terraform, Puppet, Python, Golang



Responsibilities

Participate in an on-call rotation for off-hours support
Keep the Kubernetes platform reliable, available, and fast
Architect solutions to problems that improve the reliability, scalability, performance, and efficiency of our services
Respond to, investigate, and fix service issues all the way from bare metal through the OS to the application code
Design, build, and maintain the infrastructure we need to support the NCCS
Work with our users to help them use Kubernetes
Write awesome documentation


Basic Qualifications:

A Bachelor's degree in a scientific field and 2-4 years of relevant experience, or equivalent experience.
At least four years of experience as an SRE/Sysadmin/Systems engineer


Preferred Qualifications:

Experience with Kubernetes, OpenShift, Helm, Prometheus, RHEL, Puppet, Python, Golang


Special Requirements



Q or L clearance: This position requires the ability to obtain and maintain a clearance from the Department of Energy. As such, this position is a Workplace Substance Abuse Program (WSAP) testing designated position. WSAP positions require passing a pre-placement drug test and participation in an ongoing random drug testing program.



Relocation: Moving can be overwhelming and expensive. UT-Battelle offers a generous relocation package to ease the transition process. Domestic and international relocation assistance is available for certain positions. If invited to interview, be sure to ask your Recruiter (Talent Acquisition Partner) for details.

For more information about our benefits, working here, and living here, visit the “About” tab at jobs.ornl.gov.

ORNL is an equal opportunity employer. All qualified applicants, including individuals with disabilities and protected veterans, are encouraged to apply. UT-Battelle is an E-Verify employer.
2020-10-21