First listed on: 09 June 2020

DUG, a fast-growing HPC service provider with four major global centres and regional offices, is looking for strong Linux DevOps/System Administrators capable of solving complex problems and delivering transformative projects.

We are looking for someone to join our global IT team and assist with multiple ongoing engineering projects to modernize and build out our global HPCaaS, improve service delivery and client onboarding, and generally streamline our operations and systems.

Responsibilities

  • Big-picture Linux topics and projects, such as:
    • Implementing a new inventory management system
    • Deploying software-defined network configuration
    • Moving system builds from a custom bash templating system to Ansible
    • Automation of user provisioning
  • Improve system reliability by strengthening our Linux issue tracking, resolution, and prevention
  • Define the OS requirements and procedures for running DUG software on customer equipment, and assist with those deployments
  • Serve as level-three support for otherwise unresolvable customer issues

Prerequisites

  • Very strong system-level problem solving
  • Strong system programming and automation skills with one or more of:
    • bash
    • python
    • ansible
  • Expertise in standard Linux components:
    • kernel modules
    • linux networking
    • /proc and /sys tunables
    • strace, gdb, kdump
    • ssh, rsync, ftp, tftp, nfs
    • pxe, dhcp, dns, kickstart
  • Ability to build the Linux kernel and other software from source, including making and testing custom patches
  • The tenacity and attention to detail to pursue a challenging, complex issue to its root cause and to drive projects to completion

Desirable

  • Large-scale infrastructure experience (tens of thousands of servers)
  • Experience working with internationally distributed systems and teams
  • Experience with deep inspection tools such as the BPF Compiler Collection (BCC)

 



