TECHNOGEN, Inc. is a proven leader in providing full IT services, software development, and solutions, with 15 years in the business.
TECHNOGEN is a Small & Woman-Owned Minority Business with GSA Advantage certification. We have offices in VA and MD, and offshore development centers in India. We have successfully executed 100+ projects for clients ranging from small businesses and non-profits to Fortune 50 companies and federal, state, and local agencies.
Job Title: ETL Developer
Work Location(s): Titusville, NJ / Franklin Park, NJ
Duration: Long term
Project Description:
- This project will revamp/re-design the existing process to achieve the following:
- A new, flexible framework with better performance.
- Integration of multiple datasets into one data model.
- Availability of the most recent project data to all business users and downstream applications.
Future Phase: Design and integrate a rule-engine-based architecture (in ADAL, the Authoritative Data Access Layer) into the client's new architecture framework, enabling end users/business users to access rule-based client data through a self-service portal.
Job Duties:
- Capture business rules and requirements, design the associated conceptual, logical, and physical models and create the required documentation to support, communicate and validate the data models.
- Develop and maintain data lineage for all entities used by business users.
- Migrate the existing environment (currently running on Informatica, Oracle, and Linux) to the Hadoop ecosystem on Amazon Web Services (AWS) cloud stack components.
- Develop scripts that store source system files in the Hadoop Distributed File System (HDFS).
- Model the Hadoop environment and build data standards for the data ingestion, data processing, and data visualization layers of various big data analytics projects.
- Maintain Hadoop environment code in distributed version control tools such as GitHub and Bitbucket.
- Develop and execute PySpark (Python Spark) scripts for massively parallel processing (MPP) of large data volumes.
- Migrate (for specific projects) Apache Hive tables into Amazon Redshift and develop views to be used by business teams for business analytics.
- Analyze and enhance batch jobs that run on Talend for performance improvements.
- Develop technical specification documents based on business and functional requirements.
- Perform source and target data analysis and provide gap analysis reports to be used for integration and enhancement projects.
- Develop Python scripts to pull data from open sources through APIs or AWS Data Exchange.
- Migrate complete batch jobs that run on Tidal into Control-M.
- Develop unit test cases and integration test cases to be used by technical and business teams to ensure the appropriateness of deliverables.
- Identify and develop performance improvements for ETL and business processes that run in production by tuning HQL (Hive Query Language) queries, Spark SQL queries, spark-submit configurations, SQL queries, and Redshift views.
- Spin up an EMR cluster to process data loads, install all required libraries, archive the data in S3 before terminating, and shut the cluster down after successful completion of all loads.
- Develop automation to scale EC2 instances up during peak processing times and back down after processes complete.
- Transfer bulk data between Apache Hadoop and relational databases using Sqoop.
- Provide estimates on project deliverables to the project management team.
- Develop project deliverable work tasks and delegate them to offshore developers.
- Manage the offshore development team's daily work activities and deliverables.
- Conduct technical walkthrough sessions and monitor overall progress of deliverables as a team.
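The ingestion duties above follow a common pattern: read delimited source files, validate rows, and bucket records by a partition key before landing them in HDFS. A minimal, local sketch of that pattern in plain Python (the pipe delimiter, field names, and `load_date` partition key are illustrative assumptions, not taken from the client's spec):

```python
import csv
import io
from collections import defaultdict

def partition_records(source_text, delimiter="|", partition_field="load_date"):
    """Parse delimited source rows and bucket them by a partition key.

    Rows missing the partition field are collected separately so a
    rejects file can be written alongside the good partitions.
    """
    reader = csv.DictReader(io.StringIO(source_text), delimiter=delimiter)
    partitions = defaultdict(list)
    rejects = []
    for row in reader:
        key = (row.get(partition_field) or "").strip()
        if key:
            partitions[key].append(row)
        else:
            rejects.append(row)
    return dict(partitions), rejects

# Hypothetical sample: two rows in different partitions, one reject with no date.
sample = "id|load_date|amount\n1|2024-01-01|10\n2|2024-01-02|20\n3||30\n"
parts, bad = partition_records(sample)
```

In a production PySpark job this grouping would typically be expressed as a partitioned write rather than an in-memory dict, but the validation-then-partition flow is the same.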
Degree Requirement: Bachelor's degree in computer science, computer information systems, information technology, or a combination of education and experience equating to the U.S. equivalent of a Bachelor's degree in one of the aforementioned subjects.