Job Description:

Title: Databricks & Airflow - Lead Data Engineer

Client: Accenture Merck

Location: Remote; however, candidates located near Rahway, NJ will be required to work onsite.

Duration: 12+ Months (with possible extension)

**Please submit strong and committed candidates only; screening will be stringent, so well-qualified submissions increase the chance of placement.

Must-Have:

  • Databricks expert: an experienced resource who can provide solutions and make recommendations to the client.
  • PySpark: super important.
  • SQL: super important.
  • Airflow: good experience required.

Experience

  • Experience implementing an AWS Data Lake and data publication using Databricks, Airflow, and AWS S3.
  • Experience in Databricks data engineering to create Data Lake solutions using AWS services.
  • Knowledge of Databricks clusters and SQL warehouses; experience with Delta and Parquet file handling.
  • Experience in data engineering and data pipeline creation on Databricks.
  • Experience with Data Build Tool (DBT) using Python and SQL.
  • Extensive experience with SQL, PL/SQL, complex joins, aggregation functions, DBT, Python, DataFrames, and Spark.
  • Experience with Airflow for job orchestration, dependency setup, and job scheduling.
  • Knowledge of Databricks Unity Catalog and consumption patterns.
  • Knowledge of GitHub and CI/CD pipelines, and of AWS infrastructure such as IAM roles, Secrets, and S3 buckets.
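As a hypothetical illustration of the join and aggregation skills listed above, here is a minimal, self-contained SQL sketch run via Python's built-in sqlite3 module; all table and column names are invented for the example and are not from the client's data model:

```python
import sqlite3

# Build two tiny in-memory tables, then run a join plus an aggregation --
# the same pattern (JOIN + GROUP BY) used when assembling a data mart.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
    CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount REAL);
    CREATE TABLE customers (customer_id INTEGER, region TEXT);
    INSERT INTO orders VALUES (1, 10, 50.0), (2, 10, 25.0), (3, 20, 40.0);
    INSERT INTO customers VALUES (10, 'NJ'), (20, 'NY');
""")
cur.execute("""
    SELECT c.region, COUNT(*) AS n_orders, SUM(o.amount) AS total
    FROM orders o
    JOIN customers c ON c.customer_id = o.customer_id
    GROUP BY c.region
    ORDER BY c.region
""")
rows = cur.fetchall()
print(rows)  # [('NJ', 2, 75.0), ('NY', 1, 40.0)]
```

The same join-and-aggregate logic carries over directly to Databricks SQL or a DBT model; only the dialect and the surrounding tooling change.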

Role

  • Responsible for defining the technical architecture and application landscape.
  • Responsible for authoring SQL and Python scripts on Databricks and DBT (Data Build Tool) to create data pipelines that build the Operational Data Mart (ODM).
  • Responsible for creating data pipelines that process Delta files into ODM format for downstream data consumption.
  • Responsible for identifying dataset relationships and join criteria and implementing them in code for ODM model development.
  • Responsible for creating the Delta Lake for the ODM model and setting up consumption patterns using Databricks Unity Catalog.
  • Responsible for creating Airflow DAGs for job orchestration and scheduling of data pipeline jobs.
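The Airflow responsibility above can be sketched as a minimal DAG with two dependent tasks. This is an illustrative skeleton only, assuming Airflow 2.x; the dag_id, task names, and callables are hypothetical placeholders, not the client's actual pipeline (running it requires an Airflow installation):

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_delta_files():
    """Placeholder: pull Delta files from S3 / trigger the Databricks job."""


def load_odm():
    """Placeholder: transform and publish the data into the ODM."""


with DAG(
    dag_id="odm_daily_pipeline",       # hypothetical name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",                 # job scheduling
    catchup=False,
) as dag:
    extract = PythonOperator(
        task_id="extract_delta_files",
        python_callable=extract_delta_files,
    )
    load = PythonOperator(
        task_id="load_odm",
        python_callable=load_odm,
    )

    # Dependency setup: load_odm runs only after extract_delta_files succeeds.
    extract >> load
```

The `>>` operator is how Airflow expresses the task dependencies mentioned in the bullets; the scheduler handles the daily runs once the DAG file is deployed.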
             
