Job Description :
Role: ELK Admin
Location: Pittsburgh, PA
Duration: 12+ Months

Mandatory Technical/Functional Skills: Administration of the ELK Stack - ElasticSearch, Log Stash, Kibana Kafka clustering Administration

Roles & Responsibility:
Troubleshoot high severity issues and manages the incident lifecycle to resolutions
Lead root cause analysis/investigations through identifying, analyzing and remediating service(s) performance and availability issues to ensure maximum service uptime and availability.
Conducting Blameless Post Incident Review is expected.
Engage in and improve the whole lifecycle of services-from inception and design, through deployment, operation and refinement.
Maintain services once they are live by measuring and monitoring availability, latency and overall system health. You’re expected to be on- call and have strong written communication skills and be able to develop working relationships with coworkers.
Solve complex, business critical issues that impact bottom line financial numbers and customer loyalty/experience.
Scale systems sustainably through mechanisms like automation, and evolve systems by pushing for changes that improve reliability and velocity.
Contribute positively to open source projects developed and join existing communities.
Navigate this broader ecosystem and structure projects with upstream/ downstream opportunities in mind.
Work and collaborate across teams such Application services, Capacity Planning, Hardware, Network, and Datacenter Operations.
Participate in building advanced tooling for testing, monitoring, administration, and operations of multiple clusters across multiple environments.
Experience negotiating SLIs, SLOs, and SLAs with product owners. Perform other related duties as assigned