Job Description:

Job Title:  Operations Manager V

Location:  Charlotte, NC

Start Date:  ASAP

Est. End Date: 12/31/24

Onsite in Charlotte a minimum of 3 days a week.

Systems Operations Manager V

· Strong analytical skills: able to triage an issue and own a problem through resolution; prior helpdesk experience would help

· AWS: applied cloud experience

· Dynatrace and Splunk

· Must have excellent communication skills and work well independently: a self-directed self-starter, able to lead and drive ambiguous data to root cause analysis, and also able to work collaboratively

· Video conference interview/technical screen

Description: 

The IT Production Operations team is looking for an experienced technician who will serve as a technical lead focused on operational stability, driving IT operations readiness through continuous improvement of our products. This role involves working closely with architecture, development teams, and business partners; coaching junior engineers; and implementing enhanced monitoring and alerting capabilities for our distributed platforms. The role also aids in the triage of major incidents and takes ownership of problem management activities, driving deep root cause analysis and corrective action. The ideal candidate will have good to significant experience managing production environments. We are looking for a high-energy team player with an innovative mindset who is interested in joining a group of IT professionals dedicated to enhancing IT operations. This position reports to a Director of Production Operations. A passion for technology and problem solving is a must.

· Collaborates with Agile squads/developers, other production operations teams, and business partners, and contributes significantly to developing specifications that resolve problems and address enhancement needs, focusing on logging, monitoring, and metrics for operational readiness

· Uses technical knowledge, creativity, and company practices to drive down occurrences of incidents through development of proactive alerting and monitoring.

· Provides continuous feedback to development teams on system stability, defect analysis, and system enhancements

· Contributes to runbooks and patterns to sustain applications in a production environment

· Serves as a mentor to junior IT Engineers

· Triages and analyzes runtime problems to isolate root cause and identify a resolution

· Participates in technical discussions and drives operations readiness activities with the development teams, third-party service providers, and business partners.

· Leads RCA and SWAT investigations

· Provides guidance in resolving performance-related issues and recommends technical solutions.

 

Skills:

 

· Holds a BS (preferably an MS) in Computer Science or a related field

· 5+ years of experience in a similar role and good knowledge of triage-related processes

· Shows deep knowledge and understanding of enterprise-scale platforms and architectures

· Possesses strong analytical and problem-solving skills and exhibits strong leadership skills

· Experience coordinating between upstream applications to resolve incidents

· Grasps new technologies and can adapt to rapid shifts in priorities

· Applied AWS/Cloud experience preferred

· Applied experience with as many of the following as possible: Unix and Windows platforms, Java EE, JavaScript, Spring, Spring Boot, REST API/microservices, shell scripting, Python, SQL and databases, specifically Oracle

· Previous DevOps experience with tools such as Python, Terraform, Jenkins, GitLab, and Docker preferred

· Experience with Ansible Tower or other automation tools is a bonus

· Experience with Dynatrace, Splunk, or other similar monitoring tools, creating dashboards, alerting, and reports

· Able to correlate environment conditions and metrics to application events

· Experience debugging problems in on-prem/cloud/hybrid distributed systems

             
