Job Description:
As a Senior Site Reliability / DevOps Engineer in Enterprise Data Solution (EDS), you will use your strong technical skills to help our platform and services establish a true SRE capability. You will be partnering with the data engineering team, so the ability to influence and provide operational guidance is key. Initially, the SRE's focus will be on contributing to the development of operational tools and practices that help maintain service availability across hosted and cloud-based infrastructure. You must understand the full stack and how systems are built, and have a grasp of operational best practices.



Role

Partner on the design of the next implementation of Client's secure, global data and insight architecture, building new stream processing capabilities and operationalizing the “Unified Data Acquisition and Processing (UDAP) platform”
Proactively identify and resolve performance bottlenecks
Work with the customer support group as needed to resolve performance issues in the field
Explore automation opportunities and develop tools to automate day-to-day operations tasks
Provide performance metrics and maintain dashboards that reflect the health of production systems
Conceptualize and implement proactive monitoring where possible to catch issues early
Experiment with new tools to streamline the development, testing, deployment, and running of our data pipelines.
Work with cross-functional agile teams to drive projects through the full development cycle.
Help the team improve its use of data engineering best practices.
Collaborate with other data engineering teams to improve the data engineering ecosystem and talent within Client.
Creatively solve problems when facing constraints, whether it is the number of developers, quality or quantity of data, compute power, storage capacity or just time.
Maintain awareness of relevant technical and product trends through self-learning/study, training classes and job shadowing.


All About You

At least a Bachelor's degree in Computer Science, Computer Engineering, or a technology-related field, or equivalent work experience
5+ years of experience in data warehouse-related projects in a product- or service-based organization
3+ years as a Site Reliability Engineer or DevOps Engineer
2+ years of experience as a software engineer or software architect
Experience solving for scalability, performance, and stability
Expert knowledge of Linux operating systems and environments, and of scripting (Shell and Python preferred)
Deep expertise in your field of software engineering
Expert at troubleshooting complex system and application stacks
Operational experience with big data stacks (Hadoop ecosystem; Spark is a plus)
Operational experience with real-time, streaming, and data pipeline frameworks (Kafka and NiFi are a plus)
Operational experience troubleshooting network/server communication
Experience with performance tuning of database schemas, databases, SQL, ETL jobs, and related scripts
Expertise in enterprise metrics/monitoring with frameworks such as Splunk, Druid, Grafana
Experience with cloud computing services, particularly deploying and running services in Azure or GCP
A belief in data-driven analysis and problem solving, and a proven track record of applying these principles
An organized approach to the planning and execution of major projects
             
