Job Description:

A growing enterprise team is seeking a Lead Site Reliability Engineer to own reliability, automation, and observability for large-scale, cloud-native applications in Azure. This role partners closely with software engineering and architecture teams to improve uptime, performance, and deployment workflows.

Details

  • Title: Lead Site Reliability Engineer

  • Location: DFW, TX – Hybrid (2 days onsite, 3 days remote)

  • Duration: 6-month contract-to-hire

  • Type: W2/C2C through staffing partner

Important fit note (please read before applying)

Please apply only if you have recent, hands-on .NET/C# experience and are open to a detailed technical interview.

Responsibilities

  • Lead efforts to ensure applications and services are highly available, reliable, and performant.

  • Partner with architecture and development teams to design operable, measurable, and supportable systems.

  • Define and drive adoption of SLOs/SLIs, monitoring, alerting, and incident response processes.

  • Work with development teams to diagnose performance issues and production incidents, and lead root cause analysis that results in durable fixes.

  • Design and implement automation and infrastructure-as-code to reduce manual operational work.

  • Contribute to standards, best practices, and knowledge sharing across engineering teams.

  • Participate in on-call rotation as needed.

Required Skills and Experience

  • 5+ years of hands-on experience building and supporting backend services in .NET/C# (mandatory).

  • Strong understanding of SDLC; able to read, review, and improve application code, not just pipelines.

  • Solid experience in Site Reliability Engineering / DevOps roles for distributed systems.

  • Deep experience with Microsoft Azure and cloud-native architectures.

  • Experience with containerization and orchestration (AKS / Kubernetes, Docker).

  • Strong skills with observability and monitoring (e.g., Azure Application Insights, Azure Monitor or similar).

  • Experience defining and tracking SLOs/SLIs and driving incident and problem management.

  • Hands-on experience with Infrastructure-as-Code (Terraform preferred).

  • Strong communication skills and ability to collaborate with developers, architects, and product stakeholders.

Nice to Have

  • Experience setting coding and deployment standards for development teams.

  • Prior team-lead or mentoring experience in SRE/DevOps/Platform roles.

If you meet the .NET/C# requirement and have the skills above, please apply with your updated resume and the best time for a quick screening call.

             
