Job Description:
Big Data Engineer

Top Skills Details
1) 4+ years of hands-on experience in a big data environment:
- Kafka, Hive, Impala, Kudu, Hue, Spark, Python
2) Experience with tools such as StreamSets or Kafka (or similar), change data capture (CDC), or ETL
3) Hands-on experience building DataOps/data pipelines

Description
Looking for a heads-down Hadoop developer to assist primarily with developing data pipelines, with some platform engineering work as well. This person should have 4+ years of Hadoop engineering experience in a big data environment: Spark, Hive, Impala.

The client has been investing heavily in modernizing its data platform. This role sits within an enterprise data group, helping build data pipelines for the finance group's management reporting, pulling data from the Hadoop platform.

Will build data pipelines using either StreamSets or Kafka (depending on what the specific project calls for), and will also work with change data capture (CDC) and ETL, so hands-on experience with DataOps/data pipelines is required.
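As a rough illustration of the CDC side of this work (the event shape and field names here are hypothetical, not from the posting), applying a stream of change events to a target table can be sketched in Python:

```python
# Minimal, hypothetical sketch of change-data-capture (CDC) apply logic:
# each event carries an operation, a primary key, and (for upserts) a row image.
def apply_cdc_events(table, events):
    """Apply insert/update/delete events to an in-memory 'table' keyed by id."""
    for event in events:
        op, key = event["op"], event["key"]
        if op in ("insert", "update"):
            table[key] = event["row"]    # upsert the new row image
        elif op == "delete":
            table.pop(key, None)         # remove the row if present
    return table

events = [
    {"op": "insert", "key": 1, "row": {"id": 1, "amount": 100}},
    {"op": "update", "key": 1, "row": {"id": 1, "amount": 150}},
    {"op": "insert", "key": 2, "row": {"id": 2, "amount": 75}},
    {"op": "delete", "key": 2},
]
print(apply_cdc_events({}, events))  # {1: {'id': 1, 'amount': 150}}
```

In a real pipeline the events would arrive from a CDC tool via Kafka and the target would be a Hive/Kudu table rather than a dict, but the upsert/delete semantics are the same.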
Environment:
- On-prem Cloudera (leveraging phData as the managed service provider); they are currently transitioning from the build phase to the managed service phase and are now moving into use-case development on the platform.
- StreamSets for ingestion (near-real-time change data capture, plus cleansing and minor transformation of data as it flows into Kafka)
- Hive, Impala, Spark (all open-source tools), Kafka
- WhereScape for data modeling and automation of the data warehouse for BI
- Data Vault methodology
- AtScale, used as the semantic layer for BI
- The data science teams leverage Cloudera Data Science Workbench, which provides scientists an environment for prototyping and notebook work (long term, they hope to institute a new process for repeatable, production-ready models)
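The cleansing and minor transformation mentioned in the ingestion step above might look roughly like this in Python; the field names and rules are invented for illustration, and in practice this logic would live in a StreamSets processor rather than standalone code:

```python
def cleanse_record(record):
    """Hypothetical cleansing step: trim strings, normalize null markers,
    and cast a numeric field before the record is produced to Kafka."""
    cleaned = {}
    for field, value in record.items():
        if isinstance(value, str):
            value = value.strip()            # trim stray whitespace
        if value in ("", "NULL", None):      # normalize empty markers to None
            value = None
        cleaned[field] = value
    if cleaned.get("amount") is not None:    # minor transformation: cast to float
        cleaned["amount"] = float(cleaned["amount"])
    return cleaned

raw = {"id": " 42 ", "name": "NULL", "amount": "19.99"}
print(cleanse_record(raw))  # {'id': '42', 'name': None, 'amount': 19.99}
```

Keeping transformations this light at ingestion, and leaving heavier modeling to the warehouse layer, matches the near-real-time CDC flow described above.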