Job Description :
Job Title : Spark Scala Developer with Data Warehouse
Location : Rochelle Park (NJ), Irving (TX), Alpharetta (GA)
Understanding of database concepts and of data warehouse and data lake
technologies (Teradata, Aster, Hadoop).
Understanding of data management infrastructure and IT system hardware.
Experience designing and developing end-to-end solutions in a large-scale
data warehousing environment.
Experience in data modeling, programming, data mining, large-scale data
acquisition, and transformation and cleaning of structured and unstructured data.
Subject matter expertise in data integration, data quality, database
architecture, and data governance.
Knowledge of the Hadoop ecosystem or Teradata platform, with experience in
related tools (Pig, Spark, Sqoop, Oozie, HDFS, BTEQ, FastExport, MultiLoad).
Extensive experience in SQL, Hive, Sqoop, Oozie, Unix/Linux KSH, and deploying
PySpark or Scala models on a production Hadoop cluster.
Experience in large-scale, complex ETL techniques and scripting.
Good to Have :
Experience with project management, source code management, and issue-tracking
tools (JIRA, Git, Slack).
Knowledge of the data science life cycle and machine learning techniques.
Experience in designing, developing, and deploying automated data pipelines
that leverage machine learning models.
Experience with open systems and cloud-based applications such as AWS (S3, EC2).
Experience with programming languages (Python, PySpark).
Experience with machine learning tools (DataRobot).
Experience with container technologies (Docker, Kubernetes).