Job Description:
Title: Enterprise HPC Systems Administrator
Duration: 1 Year+
Location: Austin, Texas

Job Description:
Responsible for administering and implementing Linux High Performance Computing (HPC) clusters. Performs system administration duties on a Linux HPC cluster, including cluster management, virtualization, cluster usage monitoring, health monitoring, job scheduling, application integration/installation (open source as well as vendor supported), and application performance. Improves cluster performance through kernel changes, firmware updates, library stack changes, and application container management such as Docker.

Statement of Duties and Responsibilities:
Performs system administration duties on a Linux HPC cluster: cluster management, virtualization, cluster usage monitoring, health monitoring, job scheduling, and application integration/installation. Responsible for system implementation/integration and systems performance analysis. Manages hardware and software applications in the production environment provided to HPC users.
o Installs software and updates.
o Coordinates with vendors to resolve hardware and software problems in the HPC cluster.
o Facilitates the acquisition of hardware and software products and services for the HPC cluster.
o Knowledge of commercial CAE software such as Ansys and ESI products is a plus.
o Knowledge of SLURM or other open source job schedulers is a plus.
o Compiles, configures, and integrates open source applications into the HPC environment.
o Able to learn and use internal software systems.
Monitors the availability of patches and updates, evaluates their importance to the environment, and schedules installations accordingly. Keeps abreast of the latest HPC hardware and software technology, evaluating technologies as needed. Designs, implements, and administers high performance computing clusters, performing proofs of concept such as software containers (e.g., Docker). Interacts effectively with a broad range of colleagues such as Applied Materials researchers and other IT staff. Other duties may be assigned.

Minimum Experience required:
o Knowledge of Linux and UNIX operating systems, including scripting and programming proficiencies.
o Demonstrated experience programming system maintenance tasks in C, Java, Perl, batch/shell, or another general purpose programming language.
o Knowledge of NUMA and understanding of NUMA-related APIs.
o Ability to perform complex performance analysis covering system processes, I/O subsystems, networks, and other related components.
o Must have experience with multi-threading and parallel processing tools and environments.
o Must have experience as a systems administrator.
o Must have advanced ability to analyze, design, and architect complex IT systems.
o Experience with high-performance servers and associated high-performance networks.
o Experience installing and maintaining clustered environments, including automated installation methods.
o Knowledge of common server hardware architectures, including servers (CPU, bus, memory), SANs, disk arrays, and network hardware.
o Understanding of the Red Hat Linux operating system, including processes, files, memory management, and I/O systems; networking services and protocols (e.g., TCP/IP, SSL, FTP, Telnet, LDAP).
o Understanding of IP networking, basic routing, TCP ports, and network services, including SSH, LDAP, SFTP, and HTTP(S).
o Ability to design, promote, and implement change control and configuration management, patch management, high availability systems, and structured design and support methodologies.
o Must be organized, with a strong ability to deliver tasks on time, manage multiple efforts, and work with minimal supervision.
o Demonstrated ability to proactively learn, adapt to, and use new hardware/software technologies.