Job Description:
Responsibilities:
Work with business users to understand use cases and explore tools and technologies to meet those needs
Experiment with the tools, run POCs, and compare competing offerings in the market to identify the best fit for each business need
Engage vendors to perform POCs: install tools and support self-service experimentation
Define criteria to compare and shortlist tools for a particular business need
Work in cross-disciplinary teams to understand ways to ingest rich data sources such as social media, news, internal/external documents, emails, financial data, and operational data
Understand and define the process for how the selected tools/technologies will be implemented, operationalized, and monitored for business users
Provide technical consulting, design, and coding/prototyping for Hadoop platform activities
Rapidly design, prototype, and implement architectures to address Big Data and Data Science needs
Research, experiment with, and apply leading Big Data technologies and platforms, such as Hadoop, Spark, Kafka, Kinesis, Redshift, Microsoft Azure, and AWS
Implement and test data processing pipelines and data mining/data science algorithms in a variety of hosting environments, such as AWS, Azure, and on-premises clusters
Implement automation to reduce time spent on manual processes
Work with the team to build insightful visualizations, reports, and presentations

Qualifications:
Bachelor’s degree from an accredited college/university in Computer Science, Computer Engineering, or a related field, and a minimum of four years of big data experience with multiple programming languages and technologies
2-3 years of development experience in Java and Spark on a distributed platform.
Fluency in several programming languages such as Python, Scala, or Java, with the ability to pick up new languages and technologies quickly
Understanding of cloud and distributed systems principles, including load balancing, networking, scaling, and in-memory vs. on-disk storage
Hands-on experience with large-scale big data technologies such as MapReduce, Hadoop, Spark, Hive, Hue, Impala, or Storm
2-3 years of professional experience working with the Hadoop stack, preferably Cloudera CDH.
Solid understanding of automation tools like Jenkins
Knowledge of best practices related to data security.
Hands-on experience with Cassandra and Hive, NoSQL databases (e.g., HBase, MongoDB), and SQL databases (e.g., Oracle, SQL Server, PostgreSQL, MySQL)
Ability to work efficiently in a Unix/Linux environment, with experience in source code management systems such as Git and SVN, and in Unix shell scripting
Ability to work with team members and clients to assess needs, provide assistance, and resolve problems, with strong problem-solving skills, clear verbal and written communication, and the ability to explain technical concepts to business stakeholders


Client: Direct Client