Job Description:

Job Title: Site Reliability Engineer

Location: Atlanta, GA

Duration: 12 Months

Note: Selected candidates must onboard onsite and pick up their onboarding documents and equipment in person. Travel expenses will not be reimbursed.

Job Description:

The Site Reliability Engineer will work with the systems and network administrators to develop, maintain, and improve the efficiency and reliability of infrastructure and applications.

Design/integrate systems to assist in managing infrastructure and applications

Automate the deployment of new systems using configuration management tools

Continually improve and develop documentation, knowledge base articles, and troubleshooting guides

Improve reliability and efficiency of existing applications, measure and optimize system performance, and continually innovate with a big-picture approach

Partner with development teams to improve application testing, monitoring, and release procedures

Participate in system design reviews, platform management, and capacity planning

Create and maintain a high-performance environment characterized by positive leadership and a strong team orientation

Guide, train, and mentor junior administrators and assist with labs to grow the team

Work with highly confidential data and searches; must maintain the confidentiality and security procedures established by the agency

Perform other related duties as assigned

KNOWLEDGE, SKILLS & ABILITIES:

Knowledge of open-source configuration, orchestration, and CI/CD tools

Deep understanding of Cloud Architecture and Operations

Strong troubleshooting and debugging skills

Experience handling large numbers of diverse systems with configuration management systems like Puppet, Chef, Ansible, or Salt

Understanding of standard networking protocols and components, the OSI Model, routing and switching

Ability to work effectively under pressure and with shifting priorities

Ability to understand and communicate the meaning of data analyses with individuals in a wide range of strategic and operational roles, in both written and oral form

MINIMUM QUALIFICATIONS:

Experience with high level programming languages (Python, Go, Java, etc.)

Experience with shell scripting (BASH, PowerShell, etc.)

Experience working with public and private cloud platforms (e.g., AWS, Google Cloud Platform, Microsoft Azure, Nutanix, etc.)

Experience with creating and improving documented procedures and/or playbooks

Working knowledge of .NET, Wildfly, JBoss

Strong experience in at least one relational database platform such as Microsoft SQL Server (preferred), MySQL/MariaDB, Oracle

Experience with network administration and diagramming of network topology

PREFERRED QUALIFICATIONS:

Bachelor's degree in Computer Science or another highly technical or scientific discipline

Experience with tools & technologies such as Prometheus, Grafana, AppDynamics, and Zabbix

Extensive knowledge of Windows system troubleshooting, logging, and monitoring

Knowledge of the Linux operating system (preferably CentOS)

Experience with container technology such as Docker, LXC, etc.

Knowledge of enterprise/distributed storage technologies

Required/Desired Skills

Skill Matrix

Technology

Years of Experience

Overall IT Experience

Communication (1 - 10)

Experience with high level programming languages (Python, Go, Java, etc.)

Experience with shell scripting (BASH, PowerShell, etc.)

Experience working with public and private cloud platforms (e.g., AWS, Google Cloud Platform, Microsoft Azure, Nutanix, etc.)

Working knowledge of .NET, Wildfly, JBoss

Strong experience in at least one relational database platform such as Microsoft SQL Server (preferred), MySQL/MariaDB, Oracle

Experience with tools & technologies such as Prometheus, Grafana, AppDynamics, and Zabbix

             
