Job Description:
Hands-on experience with Hive, Spark, Impala and similar tools for SQL-like exploration of large-scale data sets.
Experience with the Cloudera distribution and basic knowledge of the AWS cloud computing platform.
Ability to conduct post-mortems on production incidents: identify what went wrong and provide a detailed root cause analysis (RCA).
Perform advanced troubleshooting and monitoring of the systems to ensure SLAs are met.
Strong programming experience in Python, SQL, and shell scripting (bash).
Hands-on experience with Linux/Unix.
Knowledge of Data Lake platforms.
Knowledge of any ETL/BI tool is a plus.
Good knowledge of Big Data querying tools, such as Pig, Hive, and Impala.
Experience with NoSQL databases, such as HBase, Cassandra, MongoDB.
Knowledge of various ETL techniques and frameworks, such as Flume.
Experience with various messaging systems, such as Kafka or RabbitMQ.
Good understanding of Lambda Architecture, along with its advantages and drawbacks.
Skill set requirements: Data Lake, Cloudera, Hadoop (MapReduce, YARN, HDFS), Hive, proficiency in Spark, debugging YARN, Flume, logs, Talend, and Linux.
             
