Job Description:
Data Engineer/AWS Developer – Northern Virginia – Long-Term Consultant
Develop data filtering, transformation, and loading requirements
Define and execute ETLs using Apache Spark on Hadoop, among other data technologies
Determine appropriate translations and validations between source data and target databases
Implement business logic to cleanse & transform data
Design and implement appropriate error handling procedures
Develop project, documentation and storage standards in conjunction with data architects
Monitor performance, troubleshoot, and tune ETL processes as appropriate using tools in the AWS ecosystem
Create and automate ETL mappings to move loan-level data from source applications to target applications
Execute the end-to-end implementation of the underlying data ingestion workflow
Required Skills
At least 5 years of experience developing in Java, Python
Bachelor’s degree with equivalent work experience in statistics, data science or a related field.
Experience working with different databases and an understanding of data concepts (including data warehousing, data lake patterns, structured and unstructured data)
3+ years of experience with Data Storage/Hadoop platform implementation, including 3+ years of hands-on experience implementing and performance-tuning Hadoop/Spark deployments
Implementation and tuning experience specifically using Amazon Elastic MapReduce (EMR)
Experience implementing AWS services in a variety of distributed computing and enterprise environments
Experience writing automated unit, integration, regression, performance and acceptance tests
Solid understanding of software design principles
Preferred Skills
Understanding of Apache Hadoop and the Hadoop ecosystem. Experience with one or more relevant tools (Sqoop, Flume, Kafka, Oozie, Hue, Zookeeper, HCatalog, Solr, Avro)
Deep knowledge of Extract, Transform, Load (ETL) and distributed processing techniques such as MapReduce
Experience with columnar databases like Snowflake and Redshift
Experience in building and deploying applications in AWS (EC2, S3, Hive, Glue, EMR, RDS, ELB, Lambda, etc.)
Experience with building production web services
Experience with cloud computing and storage services
Knowledge of the mortgage industry