Data Engineer
Position Summary
The Data Engineer builds and maintains scalable data pipelines, data warehousing solutions, and data platforms that support analytics, machine learning, and business intelligence. This role focuses on data integration, ETL/ELT workflows, and ensuring data quality and availability.
Key Responsibilities
- Develop, maintain, and optimize data pipelines (batch and streaming).
- Build and manage data lakes, data warehouses, and analytics platforms.
- Design ETL/ELT workflows and automate data ingestion from multiple sources.
- Work with big data tools and distributed processing frameworks.
- Ensure data quality, validation, governance, and lineage.
- Collaborate with data scientists, analysts, and product teams.
- Optimize data storage, query performance, and processing efficiency.
- Develop APIs, dashboards, or interfaces for data access.
- Manage cloud-based data infrastructure (AWS, Azure, GCP).
- Document architecture, workflows, and data models.
Required Skills & Experience
- Strong experience with SQL, relational databases, and NoSQL systems.
- Proficiency in Python, Scala, or Java for data engineering tasks.
- Experience with big data tools (Spark, Hadoop, Kafka, Flink).
- Knowledge of ETL/ELT tools (Airflow, dbt, Glue, Informatica).
- Familiarity with data warehousing (Snowflake, BigQuery, Redshift).
- Understanding of data modeling, normalization, and data governance.
- Experience with Git, CI/CD, and cloud platforms (AWS/GCP/Azure).
Preferred Qualifications
- Experience with streaming technologies (Kafka, Kinesis, Pub/Sub).
- Knowledge of machine learning pipelines (MLflow, Feature Store).
- Background in analytics, BI tools, or dashboarding.
- Certification in AWS, Azure, or GCP data engineering.