Job Description:
Required Skills: DevOps, Java and Site Reliability

Job Details:

Responsibilities:
Design, code, test and deliver software to automate manual operational work
Troubleshoot priority incidents, facilitate blameless post-incident evaluations and ensure permanent closure of incidents
Engage with development teams throughout the life cycle to help develop software for reliability and scale, ensuring minimal refactoring or changes
Identify application patterns and analytics in support of better service level objectives
Design self-healing and resiliency patterns
Design performance tests, identify bottlenecks and opportunities for optimization and capacity demands, and present solutions for continuous improvements
Design best-in-class monitoring frameworks to accomplish end-to-end flow monitoring and noiseless alerting
Design automated software and product upgrades, change management and release management solutions
Coach or manage teams as applicable

Qualifications:
BS/BA degree or equivalent experience in a software engineering discipline
Expert in designing, coding, testing and delivering software in at least one technology stack
Expert practitioner in one or more technology domains (possibly a cross-domain expert), able to solve complex, mission-critical problems within a business or across the firm
Working knowledge of infrastructure components such as routers, load balancers, cloud products, container systems, compute, storage and networks
Excellent debugging and troubleshooting skills
             
