Job Description:
Big Data Engineer
6-month contract
Mountain View, CA
Minimum Degree Required:
Master's Degree
Skills:
Required
CAPACITY PLANNING
DEBUG
EC2
EMR
HIVE
Additional
INCIDENT MANAGEMENT
KAFKA
OPEN SOURCE
PERFORMANCE TESTING
PRODUCTION ENVIRONMENT
AMAZON ELASTIC COMPUTE CLOUD
APACHE KAFKA
BASH
JAVA
JAVASCRIPT
PYTHON
RUBY
SCALA
SQL
SWITCH CAPACITY
This job is primarily to enable Uber’s Hudi project at Intuit. We also want to extend the EMR clusters to use Netflix’s Iceberg. We are looking for someone who can integrate and patch open source big data projects and make them work for Intuit.
Responsibilities:
Managing large-scale EMR clusters in AWS
Proficient in writing EMR Spark jobs
Proficient in Java and strong in Scala
Ability to integrate and patch big data open source technologies like Uber’s Hudi and Netflix’s Iceberg to work with EMR
In-depth understanding of Hive
Good understanding of Kafka and writing producers and consumers
Experience in AWS big data technologies like EMR, EMR Consistent View, S3, Athena, Glue Catalog
Extensive hands-on experience with administering some of the following: Spark, EMR, Hive on Tez or Presto
Ability to debug EMR streaming jobs consisting of Kafka, Spark, and Hive
Operational mindset with ability to do Problem, SLA and Incident Management
Experience installing and managing Kafka is good to have
Strong critical thinking ability to assess complex problems, analyze options, navigate diverse perspectives, and develop optimal/acceptable solutions
Ability to work independently with minimal supervision
Contributions to Operational Standards and Requirements
Develop run books for problem diagnosis, resolution, and escalation.
Qualifications
BS or higher in Computer Science, or equivalent knowledge and experience; MS preferred.
Experience managing infrastructure in AWS using EMR
Proficient with automating repetitive tasks using Python, Ruby, JavaScript, or Bash
Experience designing and deploying systems on AWS
Enjoy a fast-paced, rapidly changing work environment
Good communication skills, and a passion for clean code
Experience with SQL and Python (boto3 Library)
Experience with AWS products including EC2, S3, RDS
Exposure to Big Data on AWS (Data Pipeline, Batch, Glue, S3, EMR/EC2)
Extensive experience in automation, capacity planning, and benchmarking/performance testing in AWS
Designing cloud infrastructure that is secure, scalable, and available on AWS
Troubleshooting issues and participating in ensuring the stability of the production environment