Job Description:
 
Job Title: ETL Developer
Location: New Jersey
Work Schedule: Full Time, 40 Hours/Week
Experience: As per the Job Description
 
Job Duties:
Capture business rules and requirements; design the associated conceptual, logical, and physical models; and create the documentation required to support, communicate, and validate the data models.
Develop and maintain data lineage for all entities used by business users.
Model the Hadoop environment and build data standards for the data ingestion, data processing, and data visualization layers of various big data analytics projects.
Develop and execute PySpark (Python Spark) scripts for Massively Parallel Processing (MPP) of large data volumes.
Develop technical specification documents based on the business and functional requirements.
Perform source and target data analysis and provide gap analysis reports for use in integration and enhancement projects.
Develop Python scripts to pull data from open-source sources through APIs or AWS Data Exchange.
Develop unit and integration test cases for use by technical and business teams to ensure the appropriateness of project deliverables.
Identify and develop performance improvements for technical ETL and business processes that run in production by tuning HQL (Hive Query Language) queries, Spark SQL, spark-submit jobs, SQL queries, and Redshift views.
Develop automation to scale up EC2 instances during peak processing times and scale them down after processes complete.
Transfer bulk data between the AWS environment and relational databases using Sqoop.
Provide estimates for project deliverables to the project management team.
Break project deliverables into work tasks and delegate them to offshore developers.
Manage the offshore development team's daily work activities and deliverables.
Conduct technical walk-through sessions and monitor the team's overall progress on deliverables.
Test the accuracy of data results between the ChatGPT 3.5 and 4.0 models in data analytics.
Convert scripts from Java to Python 3.8 using GitHub Copilot.
Work with multiple AWS technologies such as S3, Lambda, Data Exchange, EMR, and EC2, as well as Docker.
Execute CI/CD pipelines with Jenkins to deploy applications between environments.
 
 
             
