PySpark Developer with AWS EMR Experience

Portland, OR 97201

Date : Jul-11-16

Work Authorization

US Citizen
GC
H1B
GC EAD

Preferred Employment

Corp-Corp
W2-Permanent
W2-Contract
1099-Contract
Contract to Hire

Job Details

Experience

Junior, Midlevel, Senior, Architect

Rate/Salary ($)

Market

Duration

12 months

Sp. Area

Cloud, Big Data

Sp. Skills

Amazon (AWS)

Consulting / Contract

H1B Sponsorship Available

Required Skills :

Python Development, AWS EMR, API, Apache, ETL, Performance Tuning, Cluster, Continuous integration, Deployment, Hadoop MapReduce, SDK, SVN, UI, XML

Preferred Skills :

Domain :

SVK Technology Solutions Inc
Edison, NJ

Job Description :

Role: PySpark Developer with AWS EMR Experience
Work Location: Portland, OR
Duration: 12 Months

Technical/Functional Skills (Mandatory skills):
5+ years of programming experience with Python, with strong proficiency in the language
Familiarity with functional programming concepts
3+ years of hands-on experience developing ETL data pipelines using PySpark on AWS EMR
Hands-on experience with XML processing using Python
Good understanding of Spark’s RDD API
Good understanding of Spark’s DataFrame API
Experience configuring EMR clusters on AWS
Experience with and good understanding of the Apache Spark Data Sources API
Experience working with AWS S3 object storage from Spark
Experience troubleshooting Spark jobs, including monitoring them with the Spark UI
Performance tuning of Spark jobs
Understanding of the fundamental design principles behind business processes

Nice to have skills:
Knowledge of the AWS SDK and CLI
Experience setting up continuous integration/deployment of Spark jobs to EMR clusters
Knowledge of scheduling Spark applications on an AWS EMR cluster
Understanding of the differences between Hadoop MapReduce and Apache Spark
Proficient understanding of code versioning tools such as Git and SVN

Roles & Responsibilities:
Design, develop, and implement performant ETL pipelines using the Python API (PySpark) of Apache Spark on AWS EMR
Write reusable, testable, and efficient code
Integrate data storage solutions with Spark, especially AWS S3 object storage
Tune the performance of PySpark scripts
Ensure overall build and delivery quality, and on-time delivery, at all times
Handle meetings with customers with ease
Have excellent communication skills for interacting with customers
Be a team player, willing to work in an onsite-offshore model and to mentor other team members (both onsite and offshore)