| Job Description | We are seeking a highly skilled Service Engineer with 7 to 10 years of experience, strong infrastructure expertise, and a deep understanding of High-Performance Computing (HPC) environments. The ideal candidate will have proven experience in managing complex systems, driving third-party application integration, and ensuring seamless operations across distributed computing platforms. This role requires exceptional problem-solving ability, strong communication skills, and a proactive approach to onboarding and supporting HPC workloads. |
| Required Skills & Qualifications |
- Development Experience: Minimum 3 years in software development (PowerShell, Azure Bash, Go, or Python).
- Administration Experience: 7+ years in system administration with a focus on infrastructure in PLM and HPC environments.
- Very strong written and oral communication, collaboration, and driving skills.
- Proactive backlog management, bringing efficiencies through solution design and development.
Preferred Attributes
- Passion for HPC technologies and infrastructure optimization.
- Ability to learn and adapt to new tools and frameworks quickly.
- Strong analytical and problem-solving mindset.
- Collaborative approach with cross-functional teams. |
| Key Responsibilities |
- Drive integration and operational support for third-party applications on HPC and PLM infrastructure.
- Manage and maintain Windows Server and Linux OS environments on VMs and VM Scale Sets (VMSS).
- Configure and optimize HPC schedulers such as Windows HPC Pack, OpenPBS, Slurm, or similar.
- Support distributed computing workloads, including MPI-based applications.
- Troubleshoot and optimize networking components, focusing on TCP/IP, name resolution, and RDMA.
- Implement and manage Azure CycleCloud and Azure Batch for HPC workload orchestration.
- Collaborate with application owners to understand infrastructure dependencies and optimize performance.
- Ensure compliance with security and operational best practices across all systems. |