Sr. Site Reliability Engineer - Incident Response
Atlanta, GA
31156
Date: Today (Dec-06-25)
2025-12-06
2026-12-06
Work Authorization
US Citizen
GC
H1B
GC EAD, L2 EAD, H4 EAD, TN EAD, OPT EAD, CPT EAD
Preferred Employment
Corp-Corp
W2-Permanent
W2-Contract
1099-Contract
Contract to Hire
Job Details
Experience: Midlevel
Rate/Salary ($): Market
Duration:
Sp. Area: AI, ML, NLP, Data Science
Sp. Skills: x-Other
Permanent Direct Hire
FULL_TIME
Direct Client Requirement
Required Skills: Artificial Intelligence, Machine learning, MEAN.JS, Cloud Computing, Continuous deployment, DevOps, Splunk
Preferred Skills:
Domain:
Cox
Atlanta, GA
Job Description:
The Site Reliability Engineer - Incident Response is a critical enterprise-level role responsible for accelerating incident resolution and enhancing the overall incident management process. This individual partners with engineering teams during active incidents to troubleshoot issues using monitoring and logging tools, and post-incident, delivers executive-level summaries that clearly communicate impact, root cause, and resolution. The SRE - Incident Response also plays a key role in analyzing incident response effectiveness and identifying opportunities for systemic improvements.
Core Competencies
Engineering/Tooling: Demonstrates the ability to design, build, and maintain engineering solutions and tools that enhance reliability, automate incident response, and reduce operational toil.
Incident Troubleshooting: Skilled in interpreting logs, metrics, and traces to assist in identifying root causes during live incidents.
Monitoring & Observability: Proficient in tools such as Datadog, Splunk, New Relic, or similar platforms.
AI-Centric Engineering: Effectively leverages artificial intelligence (AI) and machine learning (ML) tools to automate, optimize, and enhance daily engineering and incident response tasks.
Executive Communication: Ability to distill complex technical issues into concise, business-relevant summaries for senior leadership.
Analytical Rigor: Strong attention to detail in validating incident data and identifying trends or gaps in response.
DevOps & Architecture Knowledge: Understands full-stack systems, CI/CD pipelines, caching, scaling, and cloud-native infrastructure.
Metrics & Reporting: Capable of calculating and interpreting key metrics like MTTA (Mean Time to Acknowledge) and MTTR (Mean Time to Resolve).
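As an illustration of the metrics named above, here is a minimal Python sketch of how MTTA and MTTR might be calculated from incident timestamps. The `Incident` record and its field names are hypothetical, and the sketch assumes both metrics are measured from the moment the incident was opened; some teams measure MTTR from acknowledgement instead.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    # Hypothetical record; real incident tools expose their own field names.
    opened: datetime
    acknowledged: datetime
    resolved: datetime


def mean_delta(deltas):
    """Average a collection of timedeltas; an empty collection is an error."""
    deltas = list(deltas)
    if not deltas:
        raise ValueError("no incidents to average")
    return sum(deltas, timedelta()) / len(deltas)


def mtta(incidents):
    """Mean Time to Acknowledge: average of (acknowledged - opened)."""
    return mean_delta(i.acknowledged - i.opened for i in incidents)


def mttr(incidents):
    """Mean Time to Resolve: average of (resolved - opened), by assumption."""
    return mean_delta(i.resolved - i.opened for i in incidents)


# Made-up sample data for illustration only.
incidents = [
    Incident(datetime(2025, 1, 1, 10, 0), datetime(2025, 1, 1, 10, 4), datetime(2025, 1, 1, 11, 0)),
    Incident(datetime(2025, 1, 2, 9, 0), datetime(2025, 1, 2, 9, 6), datetime(2025, 1, 2, 9, 30)),
]

print("MTTA:", mtta(incidents))  # 0:05:00
print("MTTR:", mttr(incidents))  # 0:45:00
```

In the sample data the acknowledgement delays are 4 and 6 minutes and the resolution times are 60 and 30 minutes, which gives the averages shown.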
Key Responsibilities of This Role
Here's how it typically looks when not tied to active on-call:
Post-Incident Review Development
Draft and deliver executive summaries post-incident.
Develop and coach teams on blameless postmortems.
Create templates, train facilitators, and help guide root cause analysis (e.g., 5 Whys, fishbone diagrams).
Maintain a central library of learnings and cross-cutting themes.
Incident Process Improvement
Actively support engineering teams during incidents by helping diagnose and resolve issues quickly.
Navigate and analyze data from observability platforms to make informed inferences about root causes.
Analyze the effectiveness of incident response to identify systemic reliability gaps.
Standardize incident response workflows (incident roles, comms, escalation paths).
Create or refine runbooks, incident command frameworks, and severity classification guides.
Metrics and Insights
Build dashboards around incident frequency, MTTR, MTTA, and recurrence rates.
Use incident data to drive reliability OKRs or engineering investments.
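The figures behind such a dashboard could be derived along these lines. This is a rough Python sketch, not a prescribed method: the `(month, root_cause)` record shape is hypothetical, and "recurrence rate" is assumed here to mean the share of incidents whose root cause had already appeared in an earlier incident.

```python
from collections import Counter


def monthly_frequency(records):
    """Count incidents per month from (month, root_cause) pairs."""
    return Counter(month for month, _ in records)


def recurrence_rate(records):
    """Fraction of incidents whose root cause was seen in an earlier incident.

    Assumes records are ordered oldest first.
    """
    if not records:
        return 0.0
    seen = set()
    repeats = 0
    for _, cause in records:
        if cause in seen:
            repeats += 1
        seen.add(cause)
    return repeats / len(records)


# Made-up sample data for illustration only.
records = [
    ("2025-01", "db-failover"),
    ("2025-01", "cert-expiry"),
    ("2025-02", "db-failover"),
    ("2025-02", "bad-deploy"),
]

print(dict(monthly_frequency(records)))  # {'2025-01': 2, '2025-02': 2}
print(recurrence_rate(records))          # 0.25
```

Only the second "db-failover" incident counts as a repeat, so one of the four incidents is recurrent.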
Tooling & AI Solutions
Partner with engineering teams to identify repetitive or high-impact tasks suitable for automation.
Develop, implement, and continuously improve custom scripts, bots, and AI-driven workflows for monitoring, alerting, and incident triage.
Evaluate and integrate emerging AI/ML technologies to optimize detection, root cause analysis, and reporting.
Ensure all tools and automations are secure, maintainable, and aligned with organizational standards and SRE best practices.
Document and socialize new tools and AI solutions, enabling adoption and knowledge sharing across teams.
Cross-Team Collaboration
Collaborate with Engineering Managers and Incident Commanders to gather and validate incident data.
Partner with product teams, infra, and leadership to socialize reliability best practices.
Act as a reliability "consultant" to squads that have impactful incidents.
Recommend enhancements to monitoring, alerting, and response processes to reduce future incident impact.
USD 99,000.00 - 165,000.00 per year
Compensation:
Compensation includes a base salary of $99,000.00 - $165,000.00. The base salary may vary within the anticipated base pay range based on factors such as the ultimate location of the position and the selected candidate's knowledge, skills, and abilities. Position may be eligible for additional compensation that may include an incentive program.
Benefits:
The Company offers eligible employees the flexibility to take as much vacation with pay as they deem consistent with their duties, the company's needs, and its obligations; seven paid holidays throughout the calendar year; and up to 160 hours of paid wellness annually for their own wellness or that of family members. Employees are also eligible for additional paid time off in the form of bereavement leave, time off to vote, jury duty leave, volunteer time off, military leave, and parental leave.