Job Description:

Onsite Interview

Data Engineer - GCP

Alpharetta - Local candidates only

***Candidates MUST work onsite 4 days per week***

Job Summary:

  • We seek a skilled Data Engineer to join our growing Data, Analytics, and Data Products team.
  • This role is pivotal in designing, building, testing, and maintaining robust, scalable data pipelines and infrastructure on the Google Cloud Platform (GCP).
  • The successful candidate will leverage core GCP services like BigQuery and Dataflow, integrate with advanced platforms like Vertex AI, and potentially utilize Gemini models for data processing tasks, ultimately enabling data-driven decision-making and powering advanced analytics across the organization.
  • This is a unique opportunity to contribute to the modernization of data capabilities within a company vital to America's energy infrastructure.


Responsibilities:

  • Data Pipeline Development & Management: Design, develop, test, deploy, and maintain scalable, efficient, and reliable batch and real-time data pipelines on GCP.
  • Utilize services such as Dataflow, Pub/Sub, Cloud Composer, and potentially Dataproc to ingest, transform, and load data from diverse sources including on-premises systems, IoT devices, and cloud applications.
  • Data Warehousing & Modeling (BigQuery): Architect, implement, manage, and optimize data models within Google BigQuery to support analytical workloads, reporting, and data science initiatives.
  • Apply best practices for schema design, partitioning, clustering, and materialized views to ensure optimal query performance and cost-effectiveness.
  • Manage data loading (ETL/ELT) processes into BigQuery.
  • Infrastructure & Operations: Implement and manage data infrastructure using Infrastructure as Code (IaC) principles and tools (e.g., Terraform).
  • Develop and maintain CI/CD pipelines for automated testing and deployment of data workflows.
  • Monitor data pipeline health, performance, data quality, and costs, proactively identifying and implementing optimizations.
  • Data Quality & Governance: Design and implement data quality frameworks, validation checks, and monitoring processes to ensure data accuracy, completeness, and consistency.
  • Collaborate on data governance initiatives and enforce data security policies, including access controls and encryption within the GCP environment.
  • Collaboration & Support: Work closely with data scientists, analysts, product owners, and business stakeholders to gather requirements, understand data needs, and deliver effective data solutions.
  • Create and maintain clear documentation for data pipelines, models, and processes.
  • Potentially mentor junior data engineers.
  • Vertex AI & Gemini Integration: Partner with data scientists to build and manage robust data pipelines that feed into and support machine learning models and workflows deployed on Vertex AI.
  • Develop feature engineering pipelines to prepare data for ML training and prediction.
  • Explore and potentially leverage Gemini models, accessed via BigQuery ML (ML.GENERATE_TEXT) or Vertex AI APIs, for automating data cleansing, transformation, enrichment, or summarization tasks within data engineering workflows.
  • As AI tools become increasingly integrated into data platforms, modern Data Engineers must understand how to build pipelines not just for ML but with embedded AI capabilities; this demands closer collaboration with data science teams and a working grasp of AI/ML tooling.
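To make the Gemini-in-BigQuery pattern above concrete, the sketch below assembles an `ML.GENERATE_TEXT` query string in Python, of the kind a pipeline step might submit via the BigQuery client. The model and table names (`analytics.gemini_model`, `analytics.support_tickets`) and the summarization instruction are hypothetical placeholders, not part of this role's actual schema; treat this as a minimal sketch of the query shape only.

```python
def build_generate_text_sql(model: str, table: str, text_col: str,
                            instruction: str) -> str:
    """Assemble a BigQuery ML.GENERATE_TEXT query that asks a Gemini
    remote model to transform or summarize a text column.

    All identifiers passed in are illustrative placeholders; a real
    pipeline would parameterize them from its own configuration.
    """
    return f"""
SELECT ml_generate_text_result
FROM ML.GENERATE_TEXT(
  MODEL `{model}`,
  (SELECT CONCAT('{instruction}: ', {text_col}) AS prompt
   FROM `{table}`),
  STRUCT(0.2 AS temperature, 256 AS max_output_tokens)
)
""".strip()

# Hypothetical enrichment step: summarize a free-text column.
sql = build_generate_text_sql(
    model="analytics.gemini_model",       # hypothetical remote model
    table="analytics.support_tickets",    # hypothetical source table
    text_col="description",
    instruction="Summarize in one sentence",
)
```

The query string would then be executed with the BigQuery client; only the prompt construction and generation parameters differ between cleansing, enrichment, and summarization use cases.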


Required Qualifications:

  • Education: Bachelor's degree in Computer Science, Engineering, Mathematics, Statistics, Information Systems, or a related technical field, or equivalent practical experience.
  • Experience: Minimum of 3-5 years of professional experience in data engineering, including designing, building, and operationalizing data pipelines, ETL/ELT processes, and data warehouses, with substantial experience on a major cloud platform. (Note: Senior roles may require 8+ years).
  • Soft Skills: Excellent analytical and problem-solving abilities.
  • Strong communication skills, capable of collaborating effectively with diverse teams (technical and non-technical).
  • Ability to work independently, manage priorities, and take ownership of tasks.


Preferred Qualifications:

  • Certifications: Google Certified Professional Data Engineer or Google Professional Machine Learning Engineer certification.
  • Other Technologies: Experience with distributed processing frameworks like Apache Spark; messaging systems like Kafka; workflow orchestration tools like Apache Airflow; data transformation tools like dbt or Dataform; containerization technologies like Docker and Kubernetes; CI/CD tools (e.g., Jenkins, Google Cloud Build); experience with other data platforms like Snowflake or Databricks.
  • Industry Experience: Prior experience working in the energy, utilities, or related industrial sectors.
             
