Hi
Hope you are doing well!
I have an urgent opening. Please review the job description below and let me know if it is of interest to you.
Title: Observability (SRE) Lead (Hybrid)
Duration: 6 Months
Location: Atlanta, GA (local candidates only)
About the Job
Dashboarding & Alerting
- Proficient in observability tools: OpenSearch, CloudWatch, Cloudflare, LogRocket, Catchpoint, Firebase, Google Analytics
- Design and maintain real-time dashboards with the above observability tools
- Create actionable alerts with well-calibrated thresholds and ownership assignments
- Continuously tune alerts to reduce noise, false positives, and alert fatigue.
Tool Ownership & Integration
- Lead the implementation and optimization of the tools: OpenSearch, CloudWatch, Cloudflare, LogRocket, Firebase, Google Analytics
- Ensure proper instrumentation, tagging, and logging across services and APIs
- Integrate dashboards and alerting with tools like Opsgenie or equivalent
- Monitor the quality and consistency of telemetry data ingestion into the observability tools
- Own documentation, standards, and health checks for log formats, trace identifiers, and metrics collection.
Incident Analysis & Reporting
- Participate in major incident triage and troubleshooting management
- Provide observability inputs during post-incident reviews (e.g., time-to-detect, alerting effectiveness)
- Produce weekly or monthly health reports showing trends and proactive insights.
Leadership
- Lead a small team of engineers or analysts focused on observability operations and tooling
- Create stories for observability-related initiatives
- Participate in continuously refining observability KPIs (e.g., latency, error rate, request rate, availability)
- Collaborate with the engineering team to align observability KPIs with business objectives.
Required Skills and Experience:
- 5+ years of experience working with CloudWatch, OpenSearch, Cloudflare, Firebase, LogRocket, and Google Analytics
- Strong understanding of distributed systems telemetry: logs, metrics, traces
- Experience creating and managing dashboards on the above tools
- Proficient in AWS cloud architecture, microservices, and serverless monitoring strategies
- Strong communication and documentation skills
If you are interested, please share your updated resume along with the best phone number and time to reach you.