Job Description:
Big Data
Chicago, IL
Long Term

Client is looking for a candidate with Scala and Spark experience.


Setting up and maintaining big data infrastructure

Selecting and integrating any Big Data tools and frameworks required to provide requested capabilities

Implementing ETL processes for existing projects that require importing data from various sources

Designing and advising any necessary infrastructure changes

Working with teams to implement solutions, help with troubleshooting, and/or move existing projects onto big data infrastructure

Skills Required:

Proficient understanding of distributed computing principles

Management of a Hadoop cluster, with all included services

Ability to solve any ongoing issues with operating the cluster

Proficiency with Hadoop v2, MapReduce, and HDFS, along with the Python language

Good knowledge of Big Data querying tools, such as Pig, Hive, and Impala

Experience with Spark

Experience with integration of data from multiple data sources

Experience with NoSQL databases, such as HBase, Cassandra, MongoDB

Knowledge of various ETL techniques and frameworks, such as Flume

Experience with Cloudera/MapR/Hortonworks distributions

Experience with Big Data ML toolkits, such as Mahout, Spark MLlib, or H2O