Job Description:

Application Architect - SRE Engineer
Phoenix, AZ; Chicago, IL; Richmond, VA (local or within 50 miles)

Responsible for reliability and support of the Container PaaS Platform, on-prem and off-prem (Azure/AWS/Google)
Monitor and troubleshoot performance, connectivity, and security issues in Azure/AWS/Google environments
Perform deep dives into systemic and latent reliability issues; handle incident management and problem management
Identify, analyze, and resolve infrastructure vulnerabilities and application deployment issues
Perform blameless RCA and partner with engineering and operations teams across the organization to roll out fixes.
Identify and drive opportunities to improve automation for the PaaS services; scope and create automation for deployment, management, and visibility of our services.
Evaluate and automate the scaling and capacity requirements within PaaS environments
Partner with risk and compliance teams to bring visibility and implement the right controls and policies in the PaaS Platform
Ensure resiliency during implementation and identify/fix resiliency problems by collaborating with engineering teams
Be a key stakeholder in the design of cloud services and work with architecture, engineering, and product teams
Participate in 24x7 on-call coverage under a follow-the-sun model
Primary Skill
Kubernetes
Secondary Skill
RedHat OpenShift
Tertiary Skill
Python
Required Skills
BS/MS degree in Computer Science or a related technical field involving systems, or equivalent practical experience.
5+ years of hands-on experience supporting Kubernetes/OpenShift/Container PaaS platforms
Experience with Python, Ansible and shell scripting
Kubernetes/OpenShift/Terraform certifications are a plus
Strong experience in major services related to Compute, Storage, Network and Security
Experience with monitoring tools like Prometheus and Dynatrace, as well as cloud native tools like Azure Monitor and Log Analytics
Strong understanding of, and background working with, complex Active Directory and IAM controls
Advanced knowledge of DNS, DHCP, Kerberos and Windows Authentication
Experience with CI/CD tools (Git, Jenkins) and the GitOps model
Excellent understanding of Linux/Windows operating systems administration
Systematic problem-solving approach, sense of ownership and drive
Ability to juggle competing priorities and adapt to changes in project scope.
Excellent interpersonal, organizational and communication (written, verbal, and presentation) skills are a must.
Proven ability to work independently with minimal supervision and as part of a team with direct responsibilities.
Desired Skills
Experience with OpenShift and managed Kubernetes services such as AKS, EKS, or GKE
Experience with Terraform, ArgoCD, Tekton, and Knative technologies
Experience in agile deployment methodologies (GitOps)
Knowledge of various container runtimes
Familiarity with the operator deployment pattern.
Experience working in a highly available multi-datacenter environment
Experience working with monitoring tools such as Prometheus, Splunk, Dynatrace, Sysdig, or similar tools.
Understanding of cost management, inventory management, and the FinOps model

             
