Job Description:
Big Data Engineer

6 month contract

Mountain View CA


Minimum Degree Required:

Master's Degree


Skills:

Required

CAPACITY PLANNING

DEBUG

EC2

EMR

HIVE



Additional

INCIDENT MANAGEMENT

KAFKA

OPEN SOURCE

PERFORMANCE TESTING

PRODUCTION ENVIRONMENT

AMAZON ELASTIC COMPUTE CLOUD

APACHE KAFKA

BASH

JAVA

JAVASCRIPT

PYTHON

RUBY

SCALA

SQL

SWITCH CAPACITY



This role is primarily focused on enabling Uber's Hudi project at Intuit. We also want to extend our EMR clusters to use Netflix's Iceberg. We are looking for someone who can integrate and patch open-source big data projects and make them work for Intuit.



Responsibilities:

Managing large-scale EMR clusters in AWS

Proficient in writing EMR Spark jobs

Proficient in Java and strong in Scala

Ability to integrate and patch open-source big data technologies such as Uber's Hudi and Netflix's Iceberg to work with EMR

In-depth understanding of Hive

Good understanding of Kafka, including writing producers and consumers

Experience with AWS big data technologies such as EMR, EMR Consistent View, S3, Athena, and Glue Catalog

Extensive hands-on experience administering some of the following: Spark, EMR, Hive on Tez or Presto

Ability to debug EMR streaming jobs consisting of Kafka, Spark, and Hive

Operational mindset with the ability to do Problem, SLA, and Incident Management

Experience installing and managing Kafka is good to have

Strong critical thinking: the ability to assess complex problems, analyze options, navigate diverse perspectives, and develop optimal or acceptable solutions

Ability to work independently with minimal supervision

Contributions to operational standards and requirements

Develop run books for problem diagnosis, resolution, and escalation.
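As an illustration of the cluster-management work above, launching a transient Spark/Hive EMR cluster is typically automated with boto3. A minimal sketch follows; the cluster name, release label, instance types, and S3 log path are placeholder assumptions, not Intuit's actual configuration:

```python
# Sketch: building a run_job_flow request for a transient, Spark-enabled
# EMR cluster. All names, instance types, and S3 paths are placeholders.

def build_job_flow_config(name, log_uri, core_count=4):
    """Return a request body for EMR's RunJobFlow API."""
    return {
        "Name": name,
        "ReleaseLabel": "emr-6.2.0",          # assumed EMR release
        "LogUri": log_uri,
        "Applications": [{"Name": "Spark"}, {"Name": "Hive"}],
        "Instances": {
            "InstanceGroups": [
                {"InstanceRole": "MASTER", "InstanceType": "m5.xlarge",
                 "InstanceCount": 1},
                {"InstanceRole": "CORE", "InstanceType": "m5.xlarge",
                 "InstanceCount": core_count},
            ],
            # Transient cluster: terminate once all steps finish.
            "KeepJobFlowAliveWhenNoSteps": False,
        },
        "JobFlowRole": "EMR_EC2_DefaultRole",
        "ServiceRole": "EMR_DefaultRole",
    }

config = build_job_flow_config("hudi-ingest", "s3://my-bucket/emr-logs/")
# With AWS credentials configured, the cluster would be launched via:
#   import boto3
#   boto3.client("emr").run_job_flow(**config)
print(config["Name"])
```

Keeping the request as a plain dict makes it easy to unit-test cluster definitions without touching AWS.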



Qualifications

BS or higher in Computer Science, or equivalent knowledge and experience; MS preferred.

Experience managing infrastructure in AWS using EMR

Proficiency automating repetitive tasks using Python, Ruby, JavaScript, or Bash

Experience designing and deploying systems on AWS

Enjoys a fast-paced, rapidly changing work environment

Good communication skills and a passion for clean code

Experience with SQL and Python (boto3 library)

Experience with AWS products including EC2, S3, and RDS

Exposure to big data on AWS (Data Pipeline, Batch, Glue, S3, EMR/EC2)

Extensive experience with automation, capacity planning, and benchmarking/performance testing in AWS

Designing cloud infrastructure that is secure, scalable, and available on AWS

Troubleshooting issues and participating in ensuring the stability of the production environment

Strong critical thinking: the ability to assess complex problems, analyze options, navigate diverse perspectives, and develop optimal or acceptable solutions

Ability to work independently with minimal supervision
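The capacity-planning and benchmarking qualifications above often reduce to back-of-the-envelope sizing: translate a measured ingest rate into a node count with headroom for spikes. A sketch follows; the event rates and per-node throughput figures are illustrative assumptions, not benchmarks from any real cluster:

```python
import math

def required_core_nodes(events_per_sec, avg_event_kb, node_mb_per_sec,
                        headroom=0.3):
    """Estimate EMR core nodes needed to sustain a streaming ingest rate.

    events_per_sec:  peak event rate (e.g. from a Kafka topic)
    avg_event_kb:    average serialized event size in KB
    node_mb_per_sec: per-node throughput measured during benchmarking
    headroom:        spare-capacity fraction reserved for traffic spikes
    """
    ingest_mb_per_sec = events_per_sec * avg_event_kb / 1024
    needed = ingest_mb_per_sec / node_mb_per_sec
    return math.ceil(needed * (1 + headroom))

# 50,000 events/s at 2 KB each is ~97.7 MB/s; at 10 MB/s per node
# with 30% headroom this rounds up to 13 core nodes.
print(required_core_nodes(50_000, 2, 10))  # → 13
```

The per-node throughput input is exactly what the performance-testing work in this role would produce, which is why benchmarking and capacity planning are listed together.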