Job Description:

Description/Job Summary:

The ideal candidate will have hands-on technical experience developing data ingestion and transformation ETL processes for analytical data loads.

 

Responsibilities:

• Select and integrate the Big Data tools and frameworks required to provide requested capabilities.

• Transition legacy Java- and Hive-based ETLs to Spark ETLs.

• Design, develop, test, and release ETL solutions, including data quality validations and metrics, that follow data governance and standardization best practices.

• Design, develop, test, and release ETL mappings, mapplets, and workflows using StreamSets, Java MapReduce, Spark, and SQL.

• Tune the performance of end-to-end ETL integration processes.

• Monitor performance and advise on any necessary infrastructure changes.

• Analyze and recommend the optimal approach for obtaining data from diverse source systems.

• Work closely with the data architects who maintain the data models, including data dictionaries and metadata registries.

• Interface with business stakeholders to understand requirements and offer solutions.

 

Requirements

Required Skills:

• Proficient understanding of distributed computing principles and hands-on experience in Big Data analytics and development.

• Good knowledge of the Hadoop and Spark ecosystems, including HDFS, Hive, Spark, YARN, MapReduce, and Sqoop.

• Experience designing and developing Spark applications in Scala that work with different file formats such as Text, SequenceFile, XML, Parquet, and Avro.

• Experience with build tools such as Ant, SBT, and Maven.

• Strong SQL coding skills and an understanding of SQL and NoSQL statement optimization and tuning.

• Ability to lead the design and implementation of ETL data pipelines.

• Experience developing data quality checks and reporting to verify ETL rules and identify data anomalies.

• Experience with AWS development using big data technologies.

• Familiarity with techniques for testing ETL data pipelines, either manually or with tools.

• AWS cloud certification, CMS experience, and Databricks and Snowflake experience are a plus.


Education/Experience Level:

• Bachelor’s degree with 5 years of experience, or 10+ years of experience, in the software development field.

• 5+ years of Big Data ETL development experience.

• 4+ years of AWS big data experience.

• 3+ years of experience developing data validation checks and quality reporting.

• 4+ years of experience tuning Spark/Java code, SQL, and NoSQL.

             
