Job Details
Position: Google Cloud Data Migration Team Lead
Duration: Long-term contract
Job Summary:
Seeking a Google Cloud data engineer to design, build, and maintain scalable, efficient data processing systems on Google Cloud Platform. This engineer will be responsible for the entire data lifecycle, from ingestion and storage through processing, transformation, and analysis. Their work will enable client organizations to make data-driven decisions by delivering clean, high-quality data to business intelligence tools, AI systems, analysts, and data scientists.
Key Responsibilities:
- Serve as Data Migration team lead for a large data and application migration to Google Cloud Platform.
- As Data Migration team lead, this individual will own the team's end-to-end data architecture and migration planning, supporting both the migration effort and the client's future-state work on Google Cloud Platform.
- As Data Migration team lead, this individual will collaborate closely with overall Google Cloud migration team leadership to deliver a successful application and data migration.
- Design and build data pipelines: Develop and maintain reliable, scalable batch and real-time data pipelines using Google Cloud Platform tools such as Cloud Dataflow (based on Apache Beam), Cloud Pub/Sub, and Cloud Composer (managed Apache Airflow); a minimal pipeline sketch follows this list.
- Create and manage data storage solutions: Implement data warehousing and data lake solutions using Google Cloud Platform products such as BigQuery and Cloud Storage, along with transactional or NoSQL databases such as Cloud SQL or Bigtable.
- Ensure data quality and integrity: Develop and enforce procedures for data governance, quality control, and validation throughout the data pipeline to ensure data is accurate and reliable.
- Optimize performance and cost: Monitor data infrastructure and pipelines to identify and resolve performance bottlenecks, ensuring that all data solutions are cost-effective and scalable.
- Collaborate with other teams: Work closely with data scientists, analysts, and business stakeholders to gather requirements and understand data needs, translating them into technical specifications.
- Automate and orchestrate workflows: Automate data processes and manage complex workflows using tools like Cloud Composer.
- Implement security: Design and enforce data security and access controls using Google Cloud Platform Identity and Access Management (IAM) and other best practices.
- Maintain documentation: Create and maintain clear documentation for data pipelines, architecture, and operational procedures.
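To make the pipeline responsibilities above concrete, here is a minimal sketch of the kind of streaming Dataflow job this role involves: Pub/Sub ingestion, a simple transform, and a BigQuery sink. It is illustrative only; the project, topic, and table names are hypothetical placeholders, not details of this engagement.

import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions


def parse_event(message: bytes) -> dict:
    # Decode a Pub/Sub payload into a BigQuery-ready row (hypothetical schema).
    return json.loads(message.decode("utf-8"))


def run() -> None:
    # streaming=True because the pipeline reads from Pub/Sub continuously;
    # pass --runner=DataflowRunner (plus project/region flags) to run on Dataflow.
    options = PipelineOptions(streaming=True)
    with beam.Pipeline(options=options) as pipeline:
        (
            pipeline
            | "ReadEvents" >> beam.io.ReadFromPubSub(topic="projects/my-project/topics/events")
            | "ParseJson" >> beam.Map(parse_event)
            | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
                "my-project:analytics.events",
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
                create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER,
            )
        )


if __name__ == "__main__":
    run()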
Required Skills & Qualifications:
- Google Cloud Platform certified professional
- 8+ years of data engineering experience developing large data pipelines in very complex environments
- Very strong SQL skills, with the ability to build complex data transformation pipelines using a custom ETL framework in a Google BigQuery environment
- Very strong understanding of data migration methods and tooling, with hands-on experience in at least three (3) data migrations to Google Cloud
- Google Cloud Platform: Hands-on experience with key Google Cloud Platform data services is essential, including:
- BigQuery: For data warehousing and analytics.
- Cloud Dataflow: For building and managing data pipelines.
- Cloud Storage: For storing large volumes of data.
- Cloud Composer: For orchestrating workflows (a DAG sketch follows this skills list).
- Cloud Pub/Sub: For real-time messaging and event ingestion.
- Dataproc: For running Apache Spark and other open-source frameworks.
- Programming languages: Strong proficiency in Python is mandatory; experience with Java or Scala is also preferred.
- SQL expertise: Advanced SQL skills for data analysis, transformation, and optimization within BigQuery and other databases.
- ETL/ELT: Deep knowledge of Extract, Transform, Load (ETL) and Extract, Load, Transform (ELT) processes.
- Infrastructure as Code (IaC): Experience with tools like Terraform for deploying and managing cloud infrastructure.
- CI/CD: Familiarity with continuous integration and continuous deployment (CI/CD) pipelines using tools such as GitHub Actions or Jenkins.
- Data modeling: Understanding of data modeling, data warehousing, and data lake concepts.
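For the Cloud Composer orchestration noted above, a minimal Airflow DAG sketch is shown below: it schedules a daily BigQuery transformation. The DAG id, dataset, and SQL are hypothetical placeholders; a real Composer deployment would also carry project/location settings and a retry policy.

from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.operators.bigquery import BigQueryInsertJobOperator

with DAG(
    dag_id="daily_events_transform",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    # Rebuild a daily summary table from the raw events table (illustrative SQL).
    transform_events = BigQueryInsertJobOperator(
        task_id="transform_events",
        configuration={
            "query": {
                "query": (
                    "CREATE OR REPLACE TABLE analytics.daily_summary AS "
                    "SELECT DATE(event_ts) AS event_date, COUNT(*) AS event_count "
                    "FROM analytics.events GROUP BY event_date"
                ),
                "useLegacySql": False,
            }
        },
    )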
We are an equal opportunity employer. All aspects of employment, including the decision to hire, promote, discipline, or discharge, will be based on merit, competence, performance, and business needs. We do not discriminate on the basis of race, color, religion, marital status, age, national origin, ancestry, physical or mental disability, medical condition, pregnancy, genetic information, gender, sexual orientation, gender identity or expression, citizenship/immigration status, veteran status, or any other status protected under federal, state, or local law.