Job Title: Big Data Consultant (PySpark or Spark-Scala)
Location: Dallas
Type: Permanent
Experience: 4-7 years (relevant)
Notice: Immediate
Job Description:
Must Have: Spark, Spark Core, PySpark, Python, Scala, HDFS, Hadoop, Kafka, Data Ingestion, Data Quality
Education: Minimum Bachelor's degree in Computer Science, Engineering, Business Information Systems, or a related field. A Master's degree related to scalable and distributed computing is a major plus.
Key Responsibilities:
- Develop Big Data applications using PySpark or Spark-Scala on Hadoop, Hive and/or Kafka, HBase, MongoDB
- Build feature engineering pipelines and scoring / machine learning models
- Deploy applications on cloud platforms
Experience & Skillset: MUST-HAVE
- Total IT / development experience of 7+ years
- Experience developing Big Data applications with PySpark or Spark-Scala on Hadoop, Hive and/or Kafka, HBase, MongoDB
- Experience with technical design and onsite-offshore coordination
- Deep knowledge of Spark libraries in Python or Scala to develop and debug solutions to complex data engineering challenges
- Experience developing sustainable, data-driven solutions with new-generation data technologies to drive business and technology strategies
- Exposure to deploying on cloud platforms
- At least 3 years of experience designing and developing data pipelines for data ingestion or transformation using PySpark or Spark-Scala
- At least 4 years of development experience with Big Data frameworks: file formats (Parquet, Avro, ORC), resource management, distributed processing, and RDBMS
- At least 4 years developing applications in Agile environments with monitoring, build tools, version control, shell scripting, unit testing, TDD, CI/CD, and change management to support DevOps
- Prior experience with ETL, SQL, or other data technologies
GOOD-TO-HAVE
- Banking domain knowledge
- Hands-on experience migrating SAS-based statistical models to machine learning models
- Experience with machine learning models and use cases for digital marketing
- ETL / data warehousing and data modelling experience prior to Big Data experience
- Deep knowledge of the AWS stack for big data and machine learning