Job Description :
10+ Years of experience in building platform, linux administration/database administration or Strong programming experience with Java / Python
3+ Years with Hadoop Ecosystem including Spark, Hbase, Kafka, Sentry, Sqoop, flume, oozie, Jupyter Notbook, Zeppelin
Experience of taking Spark to production and running production workloads is a must
Any production experience of running Spark on containers
Expertize in productionizing spark based applications in big data, Hadoop and cloud environments
Expertize in using Spark with big data processing and analytics use cases in production
Performance tuning of spark clusters and optimizing the configuration
Integration of Jupyter Notbook, Zeppelin etc., with Spark and other environments
Experience with Interoperability of various components in the Ecosystem
Experience with architecture and enterprise deployments for Big data/Spark based distributed environments
Experience with Containers and Kubernetes. Especially Spark setup with AWS EC2 and Kubernetes containers.
Expertize with Linux OS / RHEL
Batch Processing using Apache Spark, EMR, MapR and ability to recommend the right pattern for use case
Stream processing - Spark streaming, Apache Storm, Flink ,Kafka etc.,
Python / Unix Shell scripting
Ability to do capacity sizing with Hadoop and Spark based Cluster