Job Description:
Role: Sr. Data Engineer
Location: Dallas, TX/Remote
Duration: 12+ Months
Responsibilities:
· Analyze, design, and build modern data solutions using Azure PaaS services to support data visualization.
· Migrate data from on-premises SQL Server to cloud databases (Azure Synapse Analytics (DW) and Azure SQL DB/Snowflake).
· Create pipelines in ADF using Linked Services, Datasets, and Pipelines to extract, transform, and load data from sources such as Azure SQL, Snowflake, Blob Storage, Azure SQL Data Warehouse, and the write-back tool, and back in the reverse direction.
· Develop Spark applications using Spark and Spark SQL to extract, transform, and aggregate data from multiple file formats, and analyze it to uncover insights into customer usage patterns.
· Estimate cluster size, and monitor and troubleshoot the Spark Databricks cluster.
· Tune the performance of Spark applications by setting the right batch interval, the correct level of parallelism, and appropriate memory settings.
· Write UDFs in PySpark, using Python, to meet specific business requirements.
· Develop JSON scripts for deploying pipelines in Azure Data Factory (ADF) that process data using the SQL activity.
· Hands-on experience developing SQL scripts for automation.