Job Description:

Position: GCP Data Engineer

Location: Dearborn, MI

JD:

Design and build production data engineering solutions to deliver data pipeline patterns using the following Google Cloud Platform (GCP) services:

  • In-depth understanding of Google's product technology and underlying architectures
  • BigQuery – Warehouse/data marts – Thorough understanding of BigQuery internals to write efficient queries for ELT needs, create views/materialized views, build reusable stored procedures, etc.
  • Dataflow (Apache Beam) – reusable Flex templates/data processing frameworks using Java for both batch and streaming needs (an illustrative sketch follows this list).
  • Pub/Sub, Kafka, Confluent Kafka – real-time streaming of database changes or events.
  • Experience designing, building, and deploying production-level data pipelines using Kafka; strong experience working on event-driven architecture
  • Strong knowledge of the Kafka Connect framework, with experience using several connector types (HTTP REST proxy, JMS, File, SFTP, JDBC, etc.)
  • Experience in handling huge volumes of streaming messages from Kafka
  • Cloud Composer (Apache Airflow) – to build, monitor, and orchestrate data pipelines
  • Knowledge of Bigtable
  • Cloud SQL, Compute Engine, Cloud Functions, Cloud Run, App Engine, and Cloud Storage
  • Experience with open-source distributed storage and processing utilities in the Apache Hadoop family.
  • Extensive knowledge of processing various file formats (ORC, Avro, CSV, JSON, XML, etc.).
  • Knowledge of/experience with ETL tools such as DataStage or Informatica – ability to understand existing on-premises ETL workflows and redesign them in GCP.
  • Experience and expertise with Terraform to deploy GCP resources through CI/CD.
  • Knowledge of/experience with connecting to on-prem APIs from Google Cloud.
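
For illustration only (not part of the client's requirements): a minimal sketch of the kind of Dataflow (Apache Beam) streaming pipeline in Java the bullets above describe, reading change events from Pub/Sub and appending them to an existing BigQuery table. The project, topic, dataset, and table names are placeholders, and the single "payload" column is an assumption; a production pipeline would parse each event into a proper schema.

    import com.google.api.services.bigquery.model.TableRow;
    import org.apache.beam.sdk.Pipeline;
    import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
    import org.apache.beam.sdk.io.gcp.pubsub.PubsubIO;
    import org.apache.beam.sdk.options.PipelineOptionsFactory;
    import org.apache.beam.sdk.options.StreamingOptions;
    import org.apache.beam.sdk.transforms.MapElements;
    import org.apache.beam.sdk.values.TypeDescriptor;

    public class PubSubToBigQuerySketch {
      public static void main(String[] args) {
        // Standard Beam options; streaming mode is enabled explicitly.
        StreamingOptions options =
            PipelineOptionsFactory.fromArgs(args).withValidation().as(StreamingOptions.class);
        options.setStreaming(true);

        Pipeline pipeline = Pipeline.create(options);

        pipeline
            // Read raw message payloads from a Pub/Sub topic (placeholder name).
            .apply("ReadFromPubSub",
                PubsubIO.readStrings().fromTopic("projects/my-project/topics/change-events"))
            // Wrap each payload in a one-column TableRow; a real pipeline would
            // parse the event (JSON, Avro, etc.) into a proper schema here.
            .apply("ToTableRow",
                MapElements.into(TypeDescriptor.of(TableRow.class))
                    .via(payload -> new TableRow().set("payload", payload)))
            // Append the rows to an existing BigQuery table (placeholder name).
            .apply("WriteToBigQuery",
                BigQueryIO.writeTableRows()
                    .to("my-project:my_dataset.raw_events")
                    .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_NEVER)
                    .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_APPEND));

        pipeline.run();
      }
    }

Run with the DirectRunner for local testing, or pass --runner=DataflowRunner with the usual project/region/temp-location flags to execute on Dataflow; packaging the same pipeline as a reusable Flex Template adds a container image and a template spec file around this code.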


Client: TECH MAHINDRA

             
