Job Description :


Title: Sr Hadoop Engineer / Admin AWS EMR, Hive SQL tool, Security
Right to Hire: No
Professional Day Length: 8.0
Contract Duration:
Environment: Business Casual
Sponsorship Available: No
Location: New York, New York
Minimum Requirements:

Required Skillset

AWS EMR (Elastic Map Reduce)
AWS Knowledge – Instances, Security Groups, Bootstrap actions, S3 Buckets, VPC/Subnets
Hadoop Stack - HDFS, Yarn, Pig, MapReduce
Hive SQL tool
Security – Kerberos, Authentication and Authorization
Scripting Knowledge – Shell, Perl
Linux OS

Good to Have
Machine Learning Packages – R and Python
Exposure to Puppet, Bitbucket, Jenkins
Control-M scheduling tool

Responsible for
Setup, administration, monitoring, tuning, optimizing, governing Hadoop Cluster and Hadoop components
Design & implement new components and various emerging technologies in the Hadoop ecosystem, and ensure successful execution of various Proof-of-Technology (PoT) efforts
Design and implement high-availability options for critical components such as Kerberos, Ranger, Ambari, Resource Manager, and MySQL repositories.
Collaborate with various cross-functional teams (infrastructure, network, database, and application) on activities such as deploying new hardware/software, building environments, capacity uplift, etc.
Work with various teams to setup new Hadoop users, security and platform governance
Create and execute a capacity planning strategy and process for the Hadoop platform
Work on cluster maintenance as well as creation and removal of nodes using tools like Ganglia, Nagios, Cloudera Manager Enterprise, Ambari, etc.
Performance tuning of Hadoop clusters and various Hadoop components and routines.
Monitor job performance, manage file systems/disk space, cluster & database connectivity, log files, and backups/security, and troubleshoot various user issues.
Hadoop cluster performance monitoring and tuning, disk space management
Harden the cluster to support use cases and self-service in a 24x7 model, and apply advanced troubleshooting techniques to resolve critical, highly complex customer problems
Contribute to the evolving Hadoop architecture of our services to meet changing requirements for scaling, reliability, performance, manageability, and price.
Setup monitoring and alerts for the Hadoop cluster, creation of dashboards, alerts, and weekly status report for uptime, usage, issue, etc.
Design, implement, test, and document a performance benchmarking strategy for the platform as well as for each use case
Act as a liaison between the Hadoop cluster administrators and the Hadoop application development team to identify and resolve issues impacting application availability, scalability, performance, and data throughput.
Research Hadoop user issues in a timely manner and follow up directly with the customer with recommendations and action plans
Work with project team members to help propagate knowledge and efficient use of the Hadoop tool suite, and participate in technical communications within the team to share best practices and learn about new technologies and other ecosystem applications
Automate deployment and management of Hadoop services including implementing monitoring
Drive customer communication during critical events and participate/lead various operational improvement initiatives

Senior Hadoop Engineer Job Description

The Senior Hadoop Engineer will assist in the setup and production readiness of COMPANY’s Data Lake conversion to Hadoop EMR. The candidate will work on the installation and configuration of vendor and open-source components. The candidate should have knowledge of concepts such as LDAP integration, Kerberos, and highly available architectures.
What you need to succeed
Bachelor’s Degree in Computer Science, Information Science, Information Technology or Engineering/Related Field
3 years of strong Hadoop/Big Data experience.
Strong Experience with administration and management of large-scale Hadoop production clusters
Able to deploy a Hadoop cluster, add and remove nodes, keep track of jobs, monitor critical parts of the cluster, configure high availability, and schedule and take backups.
Strong Experience with Hadoop MapReduce and HDFS
Strong Experience with Hadoop cluster management/administration/operations using Oozie, Yarn, Ambari, Zookeeper, Tez, Slider
Strong Experience with Hadoop Security & Governance using Ranger, Falcon, Kerberos, Security Concepts-Best Practices
Strong Experience with Hadoop ETL/Data Ingestion: Sqoop, Flume, Hive, Spark
Experience in Hadoop Data Consumption and Other Components: Hive, HUE, HAWQ, Madlib, Spark, Mahout, Pig
Prior working experience with AWS - any or all of EC2, S3, EBS, ELB, RDS
Experience monitoring, troubleshooting, and tuning services and applications, with operational expertise such as strong troubleshooting skills and an understanding of system capacity, bottlenecks, and the basics of memory, CPU, OS, storage, and networks.
Experience with open source configuration management and deployment tools such as Puppet or Chef and Scripting using Python/Shell/Perl/Ruby/Bash
Good understanding of distributed computing environments
Education: Bachelor’s degree or equivalent work experience

Experience: Minimum 3 years as a Hadoop Engineer or Hadoop Administrator

Looking forward to your response.