Job Description:
Data Software Engineer / ETL

Location: Philadelphia, PA

Interview: Phone and face-to-face or Skype

Duration: 6+ months



JOB DESCRIPTION:

We are seeking an experienced data software engineer who has worked with large-scale, distributed data pipelines. The engineer will help create our next-generation analytics platform, with responsibilities spanning the full engineering lifecycle: architecture and design, data analysis, software development, QA, release, and operations support. The engineer will work as a member of a dedicated DevOps team tasked with building and operating the analytics platform, and will work closely with (and support) a team of data analysts/scientists.

RESPONSIBILITIES:


Create and support analytics infrastructure for high-volume, high-velocity data pipelines
Analyze massive amounts of data in both real-time and batch processing modes
Prototype ideas for new tools, products and services
Ensure a high-quality transition to production and stable operation of applications
Help automate and streamline our operations and processes
Troubleshoot and resolve issues in our dev, test and production environments
Develop and test data integration components to high standards of quality and performance
Lead code reviews and mentor less experienced members of the team
Assist with planning and executing releases of data pipeline components into production
Troubleshoot and resolve critical production failures in the data pipeline
Research, identify, and recommend technical and operational improvements to increase the reliability, efficiency, and maintainability of the analytics pipeline
Evaluate and advise on technical aspects of open work requests in the product backlog with the project lead


BASIC QUALIFICATIONS:


Minimum of a Bachelor's degree in Computer Science or a related field
At least 5 years of solid development experience on Linux/Unix platforms
5 to 7 years of development experience in Java/Scala
At least 2 years of that experience in analytics, working with distributed compute frameworks
Strong experience with ETL tools such as Pentaho and the Hadoop ETL tech stack (Hive, Sqoop, Oozie)
Experience with at least 2 live projects in Scala/Spark
Experience working in an AWS environment
Knowledge of the following technologies: Spark, Storm, Kafka, Kinesis, Avro.
Adaptable, proactive and willing to take ownership
Good communication skills and the ability to analyze and clearly articulate complex issues and technologies