Job Description:
BS/MS degree in Computer Science, Engineering, Applied Mathematics or a related field or equivalent experience
5+ years of hands-on programming expertise in Scala, Java, SQL, Python
3+ years of experience with large datasets in Hadoop (Cloudera) and Spark Ecosystem
Hands-on experience with Hadoop data storage, data stores (HBase, Cassandra), and tools (Oozie, Sqoop, Flume, etc.)
Well-versed in Cloudera (CDH 5.x) for managing security, metadata, lineage, job management, the Optimizer, Record Service, etc.
Expertise in Kafka (distributed commit log) and Spark Streaming architecture and development
Experience in the design and development of SQL-on-Hadoop applications (Spark SQL, Impala) and query optimization
Troubleshoot, tune, and accelerate data pipelines, data queries, and real-time streaming events
Passionate, self-motivated, and willing to learn

Nice to Have
Expertise in leading cloud platforms such as Amazon Web Services
Certification in Hadoop and Spark is a plus

Complete Description
Our ideal candidate will have strong programming skills in Scala, Java, Python, and SQL, as well as expertise in statistical algorithms for data analysis. The candidate should also have a background in distributed applications, data warehousing, and databases, with high proficiency in the Hadoop ecosystem and the Spark data stack.

Responsibilities
Architect, design, develop, and deliver production-grade big data solutions
Apply hands-on expertise across big data technologies and lead an agile delivery team
Measure the performance of data solutions, diagnose bottlenecks, and use monitoring tools to tune performance
Deploy flexible, scalable, and resilient data solutions to meet evolving client data product requirements
Troubleshoot, tune, and accelerate data pipelines, data queries, and real-time streaming events

Client: Direct Client