Job Description:
Role: Big Data Developer
Location: McLean, VA
Duration: 6+ months
Interview Mode: Phone and Skype

Required:

5+ years of experience processing large volumes and varieties of data (structured and unstructured, including XML, JSON, and PDF files), including writing code for parallel processing
Hadoop
Python
PySpark
Spark
3+ years of programming experience in Python and Spark for data processing and analysis
Strong SQL skills are a must
3+ years of experience using the Hadoop platform and performing analyses, with familiarity with the Hadoop cluster environment and its resource management configurations for analysis work
Excellent verbal and written communication skills
Must be able to manage multiple priorities and meet deadlines
Bachelor’s Degree in Statistics, Economics, Business, Mathematics, Computer Science or related field

Responsibilities:

Cleanse, manipulate, and analyze large datasets (structured and unstructured data: XML, JSON, PDF) using the Hadoop platform
Develop Python, PySpark, and Spark scripts to filter, cleanse, map, and aggregate data
Manage and implement data processes (data quality reports)
Develop data profiling, deduplication, and matching logic for analyses
Programming experience in Python, PySpark, and Spark for data ingestion
Programming experience on a Big Data platform using Hadoop
Present ideas and recommendations to management on the best use of Hadoop and other technologies