Job Description:
Role: DevOps/Site Reliability Engineer with Prometheus and Grafana experience

Location: Remote

Duration: 6+ months contract

Interview: Phone and Skype

Skills/Must-Haves:
Focus: Most candidates have only basic Prometheus and Grafana experience, gained because the tools were set up as part of Kubernetes or simply used for monitoring. We are not looking for tool usage alone, but for design and implementation experience: custom application metrics and custom Grafana dashboards built with PromQL, Grafana configuration, query options, and custom plugins.
Primary Skills:
1) Experience with Prometheus
a. Including setup of Prometheus for monitoring – setup of scrape configurations, recording rules, etc.
b. Understanding of PromQL and the ability to write PromQL queries
c. Understanding different types of metrics used in Prometheus and their setup
2) Experience with Grafana
a. Experience setting up Grafana dashboards and using the core set of Grafana visualizations
b. Experience using PromQL to get data for Grafana visualizations
c. Experience using variables in dashboards
d. Experience with alerting in Grafana
Do you have the following?
• Building, testing, and administering highly available Red Hat OpenShift Container Platform clusters
• Championing security by integrating it into every stage of the existing software development workflow
• Supporting Ansible scripting
• Assisting application teams with onboarding to the OpenShift platform in areas such as resource requirements, capacity analysis, and troubleshooting support
• Developing and automating repeatable tasks
• AWS provisioning, configuration management, storage management, network management, virtualization
• Creating automated CI/CD pipelines and automating all aspects of our infrastructure
• Mentoring team members in continuous delivery practices to grow a strong, cross-functional team
• Developing and improving standards for security (via security as code) across a continuous delivery environment and cloud-based production deployments
Must Have
• The DevOps Engineer is responsible for working and collaborating with our software engineering, architecture and operations teams.
• Key focus areas for this role are service quality, continuous improvements through process streamlining and extensive automation.
• The ideal candidate will use their system and networking knowledge to help maintain and monitor system health and to design for scale. They will be responsible for business systems (including development, test, UAT, and production), which includes problem management, capacity planning, monitoring, and alert management for services.
• Key focus areas for this role are service availability, resiliency, service quality, capacity management, cost of ownership, end-user performance, and architectural scalability.
• Build and maintain tools for deployment, monitoring, and operations; troubleshoot and resolve issues in our development, test, and production environments.
• Ability to convert architecture into infrastructure blueprints and design/implement individual components through automation (scripts, infrastructure as code, e.g. Terraform, Ansible, etc.)
• Good hands-on experience with the CI/CD toolkit (Jenkins, GitLab, Sonar, Git, Nexus, Maven, etc.)
• Strong background in configuring and deploying services on Linux/Unix
• Experience integrating various automated testing tools into a continuous delivery pipeline (e.g. JUnit, Cucumber, Selenium, Jasmine, PhantomJS)
• Very good problem-solving and troubleshooting skills.
• Design & build automated solutions including CI & CD pipelines, as well as build, release, and backup/recovery systems
• Understanding of key components of a microservices architecture, including containers, load balancing, and distributed caching.
• Experience with build tools such as Gradle, Maven, NPM, etc.
• Designing and implementing software engineering processes to improve efficiency
Good to Have:
• Experience with serverless platforms (e.g. AWS Lambda, Google Cloud Functions)
• Experience using AWS services
• Hands-on experience setting up virtualized/containerized deployments (Docker and Kubernetes).
• Expert level with installation/configuration automation and scripting (shell, Python etc.)
• Background in internet service architectures and a deep understanding of underlying technologies: TCP/IP, HTTP, load balancing, web servers, caching, relational and NoSQL databases
• Understanding of enterprise networks, security, and Identity and Access Management.
• AWS certification is a plus (SysOps, DevOps)
