Job Description:
Mandatory Skills: PySpark, Python, Spark, ETL/ELT architecture.


· The Senior Data Engineer will support and provide expertise in data ingestion, wrangling, and cleansing technologies. In this role, they will work with relational and unstructured data formats to create analytics-ready datasets for analytics solutions.

· The Senior Data Engineer will partner with the Data Analytics team to understand their data needs and build data pipelines using cutting-edge technologies.

· They will perform hands-on development to create, enhance, and maintain data solutions, enabling seamless integration and flow of data across our data ecosystem.

· These projects will include designing and developing data ingestion and processing/transformation frameworks leveraging open-source tools such as Python, Spark, and PySpark.
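To give candidates a feel for the kind of work involved, here is a minimal, hypothetical sketch of an extract-and-cleanse step such a framework might contain. All names (`clean_record`, `run_pipeline`) are illustrative, not from this posting, and the standard-library version shown here stands in for what would typically be a distributed PySpark job.

```python
import csv
import io

def clean_record(row):
    """Cleansing step: trim whitespace and normalize empty strings to None."""
    return {k: (v.strip() or None) if isinstance(v, str) else v
            for k, v in row.items()}

def run_pipeline(raw_csv):
    """Extract rows from CSV text and transform them into an
    analytics-ready list of dicts (a real 'load' step would write
    the result to a warehouse or data lake)."""
    rows = csv.DictReader(io.StringIO(raw_csv))
    return [clean_record(r) for r in rows]
```

The same per-record transform could be applied at scale in PySpark, e.g. by mapping `clean_record` over the rows of a DataFrame read with `spark.read.csv(path, header=True)`.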



Responsibilities:

· Translate data and technology requirements into our ETL/ELT architecture.

· Develop real-time and batch data ingestion and stream-analytics solutions leveraging technologies such as Kafka, Apache Spark, Java, NoSQL databases, and AWS EMR.

· Develop data-driven solutions utilizing current and next-generation technologies to meet evolving business needs.

· Develop custom cloud-based data pipelines.

· Provide support for deployed data applications and analytical models by identifying data problems and guiding issue resolution with partner data engineers and source data providers.

· Provide subject-matter expertise in the analysis and preparation of specifications and plans for the development of data processes.



Qualifications:

· Strong experience with data ingestion, gathering, wrangling, and cleansing tools such as Apache NiFi, Kylo, scripting, Power BI, Tableau, and/or Qlik.

· Experience with data modeling, data architecture design and leveraging large-scale data ingest from complex data sources

· Experience building and optimizing ‘big data’ pipelines, architectures, and data sets.

· Advanced SQL knowledge, including query authoring and experience working with relational databases, as well as working familiarity with a variety of database systems.

· Strong knowledge of analysis tools such as Python, R, Spark, or SAS, plus shell scripting; experience with R/Spark on Hadoop or Cassandra preferred.

· Strong knowledge of data-pipelining software, e.g., Talend or Informatica.
