Job Description:
Position: Site Reliability Engineer
Location: Reston, VA
Duration: 12+ Months

Required Experience

3 - 5+ years of relevant professional experience
Experience as a full-stack developer with hands-on knowledge of languages such as Java and Python, and exposure to application / infrastructure architecture
Excellent verbal and written communication skills with experience presenting information and/or ideas to an audience
Experience collaborating cross-functionally on availability / performance issues in order to identify root-cause, determine areas for improvement, and drive those actions to closure through effective solutions
Adept at managing project plans, resources, and people to ensure successful project completion in an Agile / Scrum environment, and at collaborating with engineering and product teams to design and develop engineering and resiliency methodologies and to implement shift-left techniques in test design & automation
Knowledge of Performance and Chaos Engineering strategies and scripts with a strong emphasis on automated deployment, infrastructure automation solutions, and continuous integration & delivery processes
Ability to identify gaps in the code from a non-functional viewpoint and experience assisting other developers to fix the code and promote relevant reliability pattern implementations
Skilled in establishing and maintaining the overall health, availability, performance, resiliency, and capacity of technology products, with specific experience in performance engineering and validation using JMeter, LoadRunner, etc.
Skilled in cloud technologies and cloud computing, including Amazon Web Services (AWS) offerings, development, and networking platforms
Experience defining, measuring, and improving Reliability Metrics (SLO/SLI), Observability (Monitoring, Logging-Tracing solutions), Operations Processes (Incident, Problem Management), and Operations Toil Reduction through Automation
Experience designing, building, and implementing dashboards from application and infrastructure health perspectives using tools such as Splunk, Dynatrace, Datadog, etc. to provide a single-pane view of all critical business and operational information to relevant stakeholders
Excellent analytical and problem-solving skills with a passion for resolving issues in a timely manner
Strong understanding & knowledge of Java / J2EE technologies & frameworks including UI / JavaScript frameworks, Spring Boot / Spring Cloud Frameworks, REST, Microservices, server-side frameworks
Knowledge of cloud technologies and containerization using Docker & Kubernetes
Excellent understanding of and demonstrated experience with DevOps / CI/CD tools like Jenkins, Terraform, Jules, and automated deployment tools
Working knowledge of at least one Unix operating system
Knowledge of performance tuning of enterprise-level Java / J2EE applications (Web and Application Server configuration, JVM parameter tuning, GC and heap size, message broker)
Experience in implementing resiliency design pattern frameworks and validation
Experience in performance engineering tools – monitoring tools, performance testing tools, and analysis tools
Experience in troubleshooting Performance / Scalability / Availability issues in production environments
Desired Experience

Bachelor’s Degree or Equivalent
Relevant certifications such as AWS Certified Solutions Architect, AWS Certified SysOps Administrator, Splunk Certified Developer, Dynatrace, Sun Certified Java Programmer, etc.
             
