Description/Job Summary:
The ideal candidate will have hands-on technical experience developing data ingestion and transformation ETL processes for analytical data loads.
Responsibilities:
• Select and integrate the Big Data tools and frameworks required to deliver requested capabilities.
• Transition legacy Java and Hive ETL jobs to Spark ETLs.
• Design, develop, test, and release ETL solutions, including data quality validations and metrics that follow data governance and standardization best practices (a minimal sketch of such a pipeline follows this list).
• Design, develop, test, and release ETL mappings, mapplets, and workflows using StreamSets, Java MapReduce, Spark, and SQL.
• Tune the performance of end-to-end ETL integration processes.
• Monitor performance and advise on any necessary infrastructure changes.
• Analyze and recommend the optimal approach for obtaining data from diverse source systems.
• Work closely with the data architects who maintain the data models, including data dictionaries and the metadata registry.
• Interface with business stakeholders to understand requirements and offer solutions.
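A minimal sketch of the kind of Spark ETL with an embedded data quality metric referenced above (the CustomerEtl object, paths, and column names are hypothetical; reading Avro assumes the spark-avro package is on the classpath):

  import org.apache.spark.sql.SparkSession
  import org.apache.spark.sql.functions._

  object CustomerEtl {
    def main(args: Array[String]): Unit = {
      val spark = SparkSession.builder().appName("customer-etl").getOrCreate()

      // Ingest: raw Avro input (hypothetical path; requires the spark-avro package)
      val raw = spark.read.format("avro").load("s3://example-bucket/raw/customers/")

      // Transform: standardize fields and drop rows failing a basic quality rule
      val cleaned = raw
        .withColumn("email", lower(trim(col("email"))))
        .filter(col("customer_id").isNotNull)
        .withColumn("load_date", current_date())

      // Data quality metric: report how many rows were rejected
      val rejected = raw.count() - cleaned.count()
      println(s"Rejected $rejected rows that failed quality checks")

      // Load: write curated Parquet partitioned by load date
      cleaned.write.mode("overwrite")
        .partitionBy("load_date")
        .parquet("s3://example-bucket/curated/customers/")

      spark.stop()
    }
  }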
Requirements:
Required Skills:
• Proficient understanding of distributed computing principles and hands-on experience in Big Data analytics and development.
• Good knowledge of the Hadoop and Spark ecosystems, including HDFS, Hive, Spark, YARN, MapReduce, and Sqoop.
• Experience designing and developing Spark applications in Scala that work with different file formats such as Text, SequenceFile, XML, Parquet, and Avro.
• Experience using the build tools Ant, SBT, and Maven (see the build sketch after this list).
• Strong SQL coding skills and an understanding of SQL and NoSQL statement optimization and tuning.
• Ability to lead the design and implementation of ETL data pipelines.
• Experience developing data quality checks and reporting to verify ETL rules and identify data anomalies.
• Experience with AWS development using big data technologies.
• Familiarity with techniques for testing ETL data pipelines, whether manually or with tools.
• AWS cloud certification, CMS experience, and Databricks or Snowflake experience are a plus.
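As a point of reference for the SBT requirement, a minimal build sketch for a Spark/Scala ETL project (the project name, library versions, and dependency set are illustrative assumptions):

  // build.sbt -- hypothetical project definition
  name := "etl-pipelines"
  scalaVersion := "2.12.18"

  libraryDependencies ++= Seq(
    "org.apache.spark" %% "spark-sql"  % "3.5.1" % Provided, // Spark SQL/DataFrame APIs
    "org.apache.spark" %% "spark-avro" % "3.5.1"             // Avro file-format support
  )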
Education/Experience Level:
• Bachelor’s degree with 5 years of experience, or 10+ years of experience, in the software development field.
• 5+ years of Big Data ETL development experience.
• 4+ years of AWS big data experience.
• 3+ years of experience developing data validation checks and quality reporting.
• 4+ years of experience tuning Spark/Java code, SQL, and NoSQL.