Job Description:
Position: Data Engineering Lead

Location: Manhattan, NY

Duration: Full Time

• Directing a team of engineers to design, develop, and implement data pipelines for business operational support, analytical workflows, and machine intelligence opportunities.

• Identifying the right data sources, establishing data modeling best practices, and installing, administering, or designing data modeling tools.

• Analyzing existing datasets and creating requirements for their incorporation into data flow pipelines as well as logical and physical data models.

• Establishing a continuous process for optimization by developing and enforcing data modeling best practices.

• Facilitating the communication of the architecture, design, and project status to project stakeholders.

• Providing business insight through integration of data with available machine learning and business intelligence tools.

• Optimizing workflows so that they can be migrated into (and out of) public cloud infrastructure where sensible.


A Bachelor’s degree in Information Systems, or equivalent experience; a Master’s degree in Computer Science, or equivalent experience.

Experience with diverse data warehousing and persistence platforms including classic data warehouses, MPP systems and the Hadoop ecosystem.

Experience with cleaning, aggregating, and pre-processing data from various sources, both structured and unstructured.

Familiarity with ETL tools and contemporary data pipeline architectures, including migrating data to cloud-hosted environments.

Experience creating complex SQL queries for analytics modeling and ad-hoc analysis. Proven ability to review query plans and perform debugging and optimization.

Demonstrated experience operating in a Unix/Linux environment.

Expertise in scripting and object-oriented development in Java, Python, or Scala.

Solid understanding of data warehouse modeling and of the data structures and algorithms used for data engineering.

Experience moving large datasets in and out of the Hadoop ecosystem as well as the use of standard Hadoop data processing tools.

Coursework or practical experience with machine learning, data mining, data quality and analytical model development.

Excellent problem solving, critical thinking and communication skills.

Strong demonstrated ability to evaluate and adopt new technologies in a short time.

Demonstrated ability to work with minimal direction, with the proven ability to coordinate and manage complex activities.

Additional skills and requirements:

Demonstrated experience providing advanced data insights in retail, e-commerce, and/or marketing functions.

Strong proficiency in writing, debugging, and optimizing code, preferably on machine learning projects (e.g. recommendation, optimization, and classification).

Understanding of financial fundamentals within business intelligence, especially as applied to the retail domain.

Experience with common analytical support and Business Intelligence tools (MicroStrategy, SSRS/SSAS, Tableau).

Demonstrated proficiency in multiple database platforms, including both relational and non-relational/NoSQL.

Demonstrated proficiency in Hadoop, Spark, Storm or related paradigms and associated languages such as Pig, Hive, and Mahout.

Experience in AWS, Azure, Google Cloud or other cloud ecosystems.

Experience in Information Retrieval libraries and platforms (e.g. Elastic, Solr, Lucene).

Proven ability to communicate, verbally and in writing, with technical and non-technical audiences.

Ability to work and manage in a dynamic development environment using agile practices.

Demonstrated ability to plan tasks and coordinate the efforts of an engineering team.