Job Description:
Big Data
Chicago, IL
Long Term
Client is looking for a candidate with Scala & Spark experience.
Responsibilities
Setting up and maintaining big data infrastructure
Selecting and integrating any Big Data tools and frameworks required to provide requested capabilities
Implementing ETL processes for existing projects that require importing data from various data sources
Designing and advising on any necessary infrastructure changes
Working with teams to implement solutions, help with troubleshooting, and/or move existing projects onto big data platforms
Skills Required:
Proficient understanding of distributed computing principles
Management of a Hadoop cluster, with all included services
Ability to solve any ongoing issues with operating the cluster
Proficiency with Hadoop v2, MapReduce, and HDFS, along with the Python language
Good knowledge of Big Data querying tools, such as Pig, Hive, and Impala
Experience with Spark
Experience with integration of data from multiple data sources
Experience with NoSQL databases, such as HBase, Cassandra, or MongoDB
Knowledge of various ETL techniques and frameworks, such as Flume
Experience with Cloudera/MapR/Hortonworks distributions
Experience with Big Data ML toolkits, such as Mahout, SparkML, or H2O