Site Reliability Engineer (SRE) - Direct Client Need
Location: Southfield, MI
Duration: 6+ months, contract to hire
Note: Must work on-site from day one.
Job Description:
At Road Ready from the client Technologies, we’re passionate about building software that solves problems. We count on our site reliability engineers (SREs) to empower users with a rich feature set, high availability, and stellar performance to pursue their missions. As we expand customer deployments, we’re seeking an experienced SRE to deliver insights from massive-scale data in real time. Specifically, we’re searching for someone who has fresh ideas and a unique viewpoint, and who enjoys collaborating with a cross-functional team to develop real-world solutions and positive user experiences for every interaction.
Objectives of this role:
• Run the production environment by monitoring availability and taking a holistic view of system health.
• Build software and systems to manage platform infrastructure and applications.
• Improve reliability, quality, and time-to-market of our suite of software solutions.
• Measure and optimize system performance, with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating for continual improvement.
• Provide primary operational support and engineering for multiple large-scale distributed software applications.
Responsibilities:
• At a day-to-day level, SREs focus on automation, monitoring, incident resolution, and culture.
• Bring a love of SRE, open source, self-service tools, and microservices.
• Gather and analyze metrics from operating systems as well as applications to assist in performance tuning and fault finding.
• Partner with development teams to improve services through rigorous testing and release procedures.
• Participate in system design consulting, platform management, and capacity planning.
• Create sustainable systems and services through automation and uplifts.
• Balance feature development speed and reliability with well-defined service-level objectives.
• After incidents, document the actions taken during incident response so they can be turned into automated solutions.
• Monitor infrastructure using SRE tools and suggest tools as necessary.
• Build monitoring alerts and incident response processes.
• Improve operational processes and team practices.
• Code infrastructure automation across the CI/CD pipeline.
• As the solution scales, ensure reliability through designing, building, and maintaining the core infrastructure.
• Demonstrate strong programming skills and thorough knowledge of systems.
• Bring about cultural shifts to provide a foundation for process changes.
• Experience with AWS multi-region/multi-AZ deployed systems, auto scaling of EC2 instances, CloudFormation, ELBs, VPCs, CloudWatch, SNS, SQS, S3, Route53, RDS, IAM roles, security groups, blue/green deployments, and A/B testing.
Required skills and qualifications:
• Bachelor’s degree (or equivalent) in computer science or a related discipline.
• Comfortable with large-scale production systems and technologies such as load balancing, monitoring, distributed systems, and configuration management.
• Strong coding skills in at least one programming language, and a desire to pick up more.
• Familiarity with and enthusiasm for software engineering best practices such as testing, continuous integration, and continuous delivery.
• Exposure to cloud platforms, Amazon Web Services (AWS), and APIs.
• The ability to thrive in a rapidly evolving, globally distributed environment.
• Strong security mindset.
• Proactive approach to identifying problems, performance bottlenecks, and areas for improvement.
• Solid understanding of fundamental technologies such as TCP/IP and HTTP.
• Strong working knowledge of Linux systems and applications.
• Experience with automation tooling such as Chef, Docker, and AWS.
• Experience with JavaScript frameworks (AngularJS, ReactJS, Node.js) and with cloud automation/orchestration technologies.
• Ability and willingness to collaborate.
• Strong problem-solving skills and ability to think under pressure.
• Strong analytical and management skills.
• Communication and documentation skills.
Preferred skills and qualifications:
• Previous success in technical engineering.
• Coding experience beyond simple scripts.