Job Description:
Role: Lead Hadoop Developer

Location: NYC, NY

Duration: 12+ months



Job description:

· 10+ years of total experience in BI & DW, with at least 4-6 years in Big Data implementations

· Understand business requirements and convert them into solution designs

· Architect, design, and develop the Big Data / Data Lake platform.

· Understand the solution's functional and non-functional requirements, and mentor the team on technology decisions.

· Hands-on experience working with Hadoop distribution platforms such as Hortonworks, Cloudera, and MapR.

· Strong hands-on experience working with Hive and Spark (Java / Scala / Python)

· Experience designing and building a Hadoop data lake

· Produce a detailed functional design document to match customer requirements.

· Responsible for preparing, reviewing, and owning technical design documentation for Big Data applications according to system standards.

· Conduct peer reviews to ensure consistency, completeness, and accuracy of the delivery.

· Detect, analyze, and remediate performance problems.

· Evaluate and recommend software and hardware solutions to meet user needs.

· Responsible for project support, and for mentoring and training during the transition to the support team.

· Share best practices and remain consultative to clients throughout the duration of the project.

· Take end-to-end responsibility for the Hadoop life cycle in the organization

· Be the bridge between data scientists, engineers, and organizational needs.

· Perform in-depth requirements analysis and select the appropriate work platform.

· Full knowledge of Hadoop Architecture and HDFS is a must

· Working knowledge of MapReduce, HBase, Pig, MongoDB, Cassandra, Impala, Oozie, Mahout, Flume, ZooKeeper, Sqoop, and Hive

· In addition to the above technologies, an understanding of major programming/scripting languages such as Java, Linux shell, PHP, Ruby, Python, and/or R

· Experience designing solutions for multiple large data warehouses, with a good understanding of cluster and parallel architecture, high-scale or distributed RDBMS, and/or NoSQL platforms

· Must have a minimum of 3 years of hands-on experience in one of the Big Data technologies (e.g., Apache Hadoop, HDP, Cloudera, MapR)

· MapReduce, HDFS, Hive, HBase, Impala, Pig, Tez, Oozie, Sqoop

· Hands-on experience designing and developing BI applications

· Excellent knowledge of relational, NoSQL, and document databases, data lakes, and cloud storage

· Expertise in various connectors and pipelines for batch and real-time data collection/delivery

· Experience integrating with on-premises and public/private cloud platforms

· Good knowledge of handling and implementing secure data collection, processing, and delivery

· Desirable: knowledge of Hadoop-ecosystem components such as Kafka, Spark, Solr, and Atlas

· Desirable: knowledge of one of the open-source data ingestion tools, such as Talend, Pentaho, Apache NiFi, Spark, or Kafka

· Desirable: knowledge of one of the open-source reporting tools, such as BIRT, Pentaho, JasperReports, KNIME, Google Chart API, or D3