Job Description:

Job Title: AWS Data Engineer

Location: Jersey City, NJ

Job Type: Full Time


We are looking for a Data Engineer to build data pipelines and handle all data-related work on AWS in order to design, develop, test and integrate a critical security-related application. The application tracks users during their daily business activities, including their attempts to access various business applications. Based on their access privileges and data entitlements, users are either permitted access to systems and data or blocked by an automated system. All exceptions and valid access patterns are then stored on AWS. This usage data is then used to develop user profiles and templates for system access based on the users' roles within the Bank.

Your primary focus will be on integrating the SoRs, extracting the data feeds from the current input systems, and shipping this data to AWS. On AWS, you will build the data pipelines using tools and services such as DMS, Glue, EMR, Spark, Hive, S3, Python and RDS. Experience with complex, systematic data ingestion, including data registration, format matching, SerDe, and metadata preservation and management, is required. Experience building data pipelines that provide complete metadata, handle data quality issues, manage errors in the ETL process, and build and preserve data lineage and data linkages is also a plus.

Role Description:

Develop ETL/data integration pipelines between various source systems and the AWS Data Lake
Use ETL tools such as Database Migration Service (DMS), Glue and Spark/Python/Scala to design, develop, test and implement the data pipelines that are part of the application functionality
Design new data flows, maps and plans, and implement them on AWS
Manage and maintain existing and new business rules using ETL or a rules engine, and test and integrate them into the data pipelines
Work with source-side teams to understand data models and source data requirements, create staging and final Data Lake data models, and produce HLDs and LLDs for the required data models
Use SQL skills to query data and understand the data relationships, and use ad hoc querying to understand data flows, data transformations, and data reconciliation and validation
Test the data pipelines in development and QA environments
Consult and work with multiple teams on a daily basis to uncover business needs and data integration and validation requirements
Exposure to AWS SageMaker is preferable, and knowledge of ML model development using Python scripts and libraries is a must


Any graduate degree; Computer Science preferred

Desired Skills and Competencies:

7 to 12 years of solid hands-on experience in the ETL and data warehousing space
Must have 3 to 5 years of recent hands-on experience working on AWS and other cloud environments
Hands-on experience with Big Data technologies such as Hadoop, Spark, Sqoop, Hive and Atlas, with knowledge of developing UNIX and Python scripts
Strong experience in Data Integration and ETL/ECTL/ELT techniques
Must have hands-on experience in building data models for Data Lakes, EDWs and Data Marts using 3NF, denormalized data models and dimensional models (Star, Snowflake, Constellations, etc.)
Should have strong technical experience in design (mapping specifications, HLD, LLD) and development (coding, unit testing) using big data technologies
Should have experience with SQL database programming, SQL performance tuning and relational model analysis
Must have experience with Python and its ML libraries to write data feeds and ML validation and testing scripts for ML-based predictive models
Should be able to provide oversight and technical guidance for developers on ETL and data pipelines
Must have good communication and documentation skills, and should be able to lead meetings, technical discussions and escalation calls