Job Description:
VDart

We are a global information technology services and workforce solutions firm headquartered in Atlanta, GA, with a presence in the US, Canada, Mexico, the UK, Belgium, Japan, and India. Founded in 2007, VDart has a team of 2,550+ professionals who continually create impact for our customers worldwide by solving complex technology challenges with cutting-edge technologies. We specialize in providing Fortune 1000 companies with niche, hard-to-find skills in technologies including Social, Mobile, Big Data Analytics, Data Sciences, Cyber Security, IoT, Cloud, Machine Learning, and Artificial Intelligence. With delivery centers in the UK, Mexico, Canada, and India, we provide global workforce solutions to our customers covering EMEA, APAC, and the Americas. VDart is an award-winning organization recognized by the Inc 5000 Hall of Fame; Atlanta Business Chronicle's Fastest Growing Companies; NMSDC's National Supplier of the Year; Ernst & Young's Regional Entrepreneur of the Year; and more.

Title: Site Reliability Engineer
Location: Herndon, VA (Remote: Yes, until COVID ends)
Hire Type: Contract
Skills: Java/J2EE, web servers (Apache Tomcat, IBM HTTP Server), Cloud, AWS, GCP, Azure, OpenShift, DevOps, CI/CD

Technical Skills
- 5+ years of experience with Java/J2EE technologies, including one of the web servers (Apache Tomcat, IBM HTTP Server), one of the application servers (WebSphere/WebLogic/JBoss), and one of the databases (Oracle/SQL Server/DB2)
- Strong understanding and knowledge of Java/J2EE technologies and frameworks: UI/JavaScript frameworks, Spring Boot/Spring Cloud frameworks, REST, microservices, and server-side frameworks
- Experience working with cloud platforms: AWS, GCP, Azure, OpenShift, PCF
- Excellent understanding of, and demonstrated experience with, DevOps and CI/CD tools such as Jenkins, Jules, and automated deployment tools
- Working knowledge of one of the Unix operating systems
- Knowledge of cloud technologies and containerization using Docker and Kubernetes
- Automation experience with Ansible playbooks and with programming or scripting languages such as Java, Perl, Python, or PowerShell
- Knowledge of performance tuning for enterprise-level Java/J2EE applications (web and application server configuration, JVM parameter tuning, GC and heap size, message broker)
- Experience implementing resiliency design patterns using Hystrix, service mesh, or similar frameworks, and validating them with Chaos Monkey-type frameworks
- Excellent knowledge of at least one tool in each of the following categories:
  - Profiling: JProfiler/Dynatrace
  - Monitoring: Wily Introscope/AppDynamics/Dynatrace/Splunk/CloudWatch/Stackdriver
  - Analysis: HP Diagnostics/GC log analysis/Thread Dump Analyzer/Heap Analyzer
  - Performance testing: LoadRunner/Silk Performer/JMeter/NeoLoad
- Experience troubleshooting performance, scalability, and availability issues in production environments
- Experience in performance test modeling
- Experience in capacity planning
- Ability to come up with solutions using technical knowledge and tools

Job Description
- Work with application stakeholders to define non-functional requirements (NFRs) covering performance, scalability, availability, resiliency, and reliability, including Service Level Objectives, Service Level Indicators, and Error Budgets
- Develop strategies to address the non-functional requirements throughout the software or product development life cycle
- Work with architecture and development teams to create performant, highly resilient, and reliable architecture and design
- Work with architecture and development teams to implement resiliency constructs and develop optimal code
- Work with QA to validate and certify that performance, scalability, availability, resilience, and reliability requirements are met
- Develop tools and utilities to automate manual operational tasks in production
- Take responsibility for the performance, scalability, availability, resilience, monitoring, and capacity management of the applications and services in production
- Take responsibility for incidents related to NFRs: update SOPs to capture the right set of metrics and logs for RCA, perform root cause analysis of the incidents, identify solutions, and ensure permanent closure of the incidents
- Analyze production utilization and incident patterns, identify improvement areas, and implement automation to improve productivity and avoid manual tasks and recurring incidents
- Strong communication and presentation skills, with an emphasis on executive communication
- Ability to learn and apply new technologies quickly, and eagerness to learn new things

If your skills match our requirements, click here to apply. Be sure to reference the job number and title in the subject line.

Referral Program: Ask our recruiting team how you can be a part of our referral program. If you refer a candidate with this background and the candidate accepts the role, our team pays a generous referral fee. We are keen on networking and establishing a long-term, mutually beneficial partnership with you.

We are an Equal Employment Opportunity Employer.

VDart Inc, Alpharetta, GA
Follow us on Twitter for the hottest positions: @VDart_Jobs
Follow us on Twitter: @vdartinc