Job Description:


We are looking for a skilled ETL Developer to join our cloud solutions team.
The ideal candidate has an eye for building and optimizing ETL systems and will work closely with our systems architects and data engineers to extract, transform, aggregate, and store data within the data pipeline, ensuring consistent data delivery and effective use of our reporting and data analytics systems.
This is a contract position based in our office in the northern suburbs of Chicago.




Qualifications:


Bachelor’s degree in Computer Science, Information Systems, or an equivalent quantitative field, and 5+ years of experience in a similar ETL role.
Experience working with and extracting value from large, disconnected, and/or unstructured datasets.
Demonstrated ability to build processes that support data transformation, data structures, metadata, dependency management, and workload management.
Strong interpersonal skills and the ability to manage projects and work with cross-functional teams.
Advanced SQL knowledge and experience with relational databases and query authoring, as well as working familiarity with a variety of databases, especially SQL Server and Hive.
Experience building and optimizing ‘big data’ pipelines, architectures, and data sets.
Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
Experience with major Hadoop ecosystem distributions such as HDP and Cloudera; HDP is preferred.
Experience with public cloud platforms such as Azure and AWS; Azure is preferred.
JSON document processing.
Apache Hive, Apache HBase, and Microsoft SQL Server.
Apache NiFi and Apache Kafka.
Object-oriented/functional scripting languages such as Python and Java.




Responsibilities:


Work closely with systems architects and data engineers to create and optimize the architecture of the ETL system.
Design and build ETL tools to extract, transform, aggregate, and store data.
Continually push for greater efficiency and robustness in the ETL process to support high-volume data flow among different data systems.