Job Description:

Sr. Data Engineer
Fulltime - Direct Hire
Fully remote 

The Data Engineer will be responsible for understanding the client's technical requirements and for designing and building data pipelines that support them. Besides developing the solution, the Data Engineer will also oversee other engineers' development work. This role requires strong verbal and written communication skills and the ability to communicate effectively with the client and the internal team. A strong understanding of databases, SQL, cloud technologies, and modern data integration and orchestration tools such as Google Dataflow, Informatica, and Airflow is required to succeed in this role.

Responsibilities:
• Play a critical role in the design and implementation of data platforms for the AI products.
• Develop productized and parameterized data pipelines that feed AI products leveraging GPUs and CPUs.
• Develop efficient data transformation code in Spark (in Python and Scala) and Dask.
• Build workflows to automate data pipelines using Python and Argo.
• Develop data validation tests to assess the quality of the input data.
• Conduct performance testing and profiling of the code using a variety of tools and techniques.
• Guide Data Engineers in delivery teams to follow the best practices in deploying the data pipeline workflows.
• Build data pipeline frameworks to automate high-volume and real-time data delivery for our data hub.
• Operationalize scalable data pipelines to support data science and advanced analytics.
• Optimize customer data science workloads and manage cloud service costs and utilization.
• Develop sustainable, data-driven solutions with current and new-generation data technologies to drive our business and technology strategies.

Minimum Education:
Bachelor's, Master's, or Ph.D. degree in Computer Science, Engineering, or Statistics.

Qualifications / Minimum Work Experience (years):
5+ years of experience programming with at least one of the following languages: Python, Scala, Go.
5+ years of experience in SQL and data transformation.
5+ years of experience in developing distributed systems using open-source technologies such as Spark and Dask.
5+ years of experience with relational or NoSQL databases running in Linux environments (MySQL, MariaDB, PostgreSQL, MongoDB, Redis).

Key Skills and Competencies:
Experience working in an AWS, Azure, or GCP environment is highly desired.
Experience with data models in the retail and consumer products industry is desired.
Experience working on agile projects and an understanding of agile concepts are desired.
Demonstrated ability to learn new technologies quickly and independently.
Excellent verbal and written communication skills, especially in technical communications.
Ability to work and achieve stretch goals in a very innovative and fast-paced environment.
Ability to work collaboratively in a diverse team environment.
Ability to telework.
Expected travel: Not expected.

             
