Job Description:
Responsibilities:
Design and implement distributed data processing pipelines using Spark, Hive, Python, and other tools and languages prevalent in the Hadoop ecosystem. Ability to design and implement end-to-end solutions.
Experience publishing RESTful APIs to enable real-time data consumption, using OpenAPI specifications
Experience with NoSQL technologies such as HBase, DynamoDB, and Cassandra
Familiarity with distributed stream-processing frameworks for fast and big data, such as Apache Spark, Flink, and Kafka Streams
Build utilities, user-defined functions, and frameworks to better enable data flow patterns.
Work with architecture/engineering leads and other teams to ensure quality solutions are implemented and engineering best practices are defined and adhered to.
Experience with business rule management systems such as Drools
Qualification:
MS/BS degree in computer science or a related discipline
6+ years of experience in large-scale software development
3+ years of experience in Big Data technologies
Strong programming skills in Java/Scala, Python, Shell scripting, and SQL
Strong development skills around Spark, MapReduce, and Hive
Strong skills in developing RESTful APIs