Job Description:

Job Title: Data Engineer

Location: Plano, Texas

Key Skills (Must Have):

Below are the key requirements for both Data Engineer roles:

We are looking for someone with expertise in PySpark scripting and strong skills in Azure Databricks. The ability to develop complex Snowflake data pipelines is preferred.
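For flavor, here is a minimal sketch (assuming a Databricks notebook, where spark is predefined) of the kind of PySpark-to-Snowflake read this work involves; all connection options and the table name below are placeholders, not details from this posting:

    # Read a Snowflake table into a Spark DataFrame via the Snowflake Spark connector.
    # Every option value is a placeholder; in practice, pull credentials from a secret scope.
    sf_options = {
        "sfUrl": "myaccount.snowflakecomputing.com",  # placeholder account URL
        "sfUser": "SVC_PIPELINE",                     # placeholder service user
        "sfPassword": "********",                     # never hard-code real passwords
        "sfDatabase": "ANALYTICS",
        "sfSchema": "PUBLIC",
        "sfWarehouse": "COMPUTE_WH",
    }

    orders = (
        spark.read.format("snowflake")  # connector short name registered on Databricks
        .options(**sf_options)
        .option("dbtable", "ORDERS")    # placeholder table name
        .load()
    )
    orders.show(5)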

Job Description: The Data Engineer will be responsible for building big data pipelines using open-source tools and enterprise frameworks in response to new business requests. This individual will work closely with data scientists and SMEs to define and map data requirements that can be translated into executable data processing pipelines.

The role covers the design and implementation of specific data models that ultimately help drive better business decisions through insights from a combination of external and internal data assets. It is also accountable for developing the necessary enablers and data platform in the Big Data Computing Environment and for maintaining its integrity across the data life cycle phases.

Daily Responsibilities:

  • Gather requirements for data integration and business intelligence applications.
  • Determine and document data mapping rules for movement of medium to high complexity data between applications.
  • Analyze existing PySpark/Scala/SnowSQL code, or build new code where necessary, to evolve existing prototypes into modern, scalable data processing pipelines using Snowflake and Databricks.
  • Work directly with the client user community as a data analyst to define and document data requirements.
  • Create reusable software components (e.g., specialized Spark UDFs; a minimal sketch follows this list) and analytics applications.
  • Support architecture evaluation of the enterprise data platform through implementation and launch of data preparation and data science capabilities.
  • Perform data quality validation. Employ data mining techniques to achieve data synchronization, redundancy elimination, source identification, data reconciliation, and problem root cause analysis.
  • Build high-performance algorithms, prototypes, predictive models, and proofs of concept.
  • Support data selection, extraction, and cleansing for enterprise applications, including data warehouse and data marts.
  • Investigate and resolve data issues across platforms and applications, including discrepancies of definition, format, and function.
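As noted in the responsibilities above, here is a minimal, self-contained sketch of a reusable PySpark UDF; the normalization logic, column names, and sample data are illustrative assumptions, not requirements from this posting:

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F
    from pyspark.sql.types import StringType

    spark = SparkSession.builder.appName("udf-sketch").getOrCreate()

    # Hypothetical reusable UDF: reduce a free-text phone number to its digits.
    @F.udf(returnType=StringType())
    def normalize_phone(raw):
        if raw is None:
            return None
        digits = "".join(ch for ch in raw if ch.isdigit())
        return digits or None

    df = spark.createDataFrame([("(972) 555-0100",), (None,)], ["phone"])
    df.select(F.col("phone"), normalize_phone(F.col("phone")).alias("phone_digits")).show()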

Required Qualifications and Skills:

  • 8+ years of Data Warehousing and Big Data Technology experience.
  • 3+ years of experience with Databricks, preferably Azure Databricks.
  • 3+ years of strong PySpark scripting experience.
  • Strong knowledge of Data Quality Management
  • Strong understanding and use of databases: relational (especially SQL) as well as NoSQL datastores
  • Intermediate knowledge of Snowflake required
  • Prior experience with data exploration, prototyping and visualization tools: e.g., Zeppelin, Jupyter, Power BI, Tableau
  • Prior experience with deploying complex data science solutions is a strong plus

Desired Qualifications:

  • Experience working in the telecommunications industry

Education: Bachelor's or Master's degree in Computer Science or equivalent

Ask the following questions to the candidate:

  • Do you have working knowledge of Hadoop and/or experience with Palantir? If yes, please relate it to a project on your resume.
  • Do you have working knowledge of Snowflake? If yes, please relate it to a project on your resume.
  • Have you migrated data into Snowflake?
  • Have you developed data pipelines in Snowflake from scratch?
  • How experienced are you in PySpark scripting? Please rate yourself on a scale of 1 to 10.
