Job Description:
Build and maintain data pipelines for different types of data (batch and streaming).
You're an expert in one or more programming languages (Java, Python, Scala, etc.).
You have a deep understanding of Spark and SQL, including job tuning and schema design.
You have extensive experience with data quality and data manipulation (ETL) tools that convert data into actionable information.
You have experience with AWS, Databricks, and Hadoop for large-scale data processing.
You're comfortable with Airflow and have an excellent understanding of scheduling and workflow frameworks and principles.
You're comfortable with Agile software development practices (CI/CD, test cases, etc.).
You publish good documentation to help people use the information you provide.
You pay meticulous attention to end-to-end data quality, validation, and consistency.