Job Description:

We are looking for an experienced Data Engineer who will be responsible for standardizing our data ingestion pipelines and improving the performance of data storage and retrieval. Our projects ingest structured and unstructured data from multiple sources, all of which eventually ends up in a unified Data Lake. This data is available to various other teams for reporting and analytics. The candidate should be comfortable working with both SQL and NoSQL databases. Cloud experience is a bonus.

The candidate's primary focus will be the continuous development and improvement of our data ingestion pipeline and data parsing. Major technologies involved include SQL, MongoDB, Elasticsearch, NiFi, Python, Hadoop, Spark, and AWS.

Skills and Responsibilities

Support existing and develop new data flows as needed by developing processes that verify, standardize, and scale data input, transformation and storage.
Ability to write intermediate to advanced SQL/Python for data ingestion and processing.
Develop techniques to work with structured (tabular/hierarchical) and unstructured data.
Port legacy ETL scripts in NiFi/Pentaho to the cloud.
Research and implement cutting-edge solutions to solve challenges related to ETL, data processing, and analytics.
Support efforts by the Data Science and Data Analytics teams.
Ability to debug complex data issues without frequent guidance from senior team members.
Occasional Linux server management including the review or management of log files, crontab, security configuration, etc.
Collaborate with product managers and other engineers to implement and document complex and evolving requirements.
Strong attention to detail, good work ethic, ability to work on multiple projects simultaneously, and good communication skills.

Required Skills

3 to 6 years of experience working in Data Engineering or related roles.
A degree in Computer Science or an equivalent major.
1+ years of experience using NiFi/Pentaho.
Experience working with Elasticsearch.
Experience working with various database systems (SQL, NoSQL, etc.).
Experience with agile methodologies and short release cycles.
Experience programming in various languages, especially Python and SQL.
Enjoys collaborating with other engineers on architecture and sharing designs with the team.

Desired Experience

Hands-on experience with the Hadoop ecosystem.
Experience with cloud technologies (AWS).
Experience with MongoDB.
Distributed systems development for large-scale applications.
Experience working with PHI/healthcare data.
AI/ML experience.