Job Description:

Job Title: Data Engineer

Location: Plano, Texas

Key Skills (Must Have):

Below are the key requirements for both Data Engineer roles:

We are looking for someone with expertise in PySpark scripting and strong skills in Azure Databricks. The ability to develop complex Snowflake data pipelines is preferred.
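For flavor, here is a minimal sketch (assuming a Databricks notebook, where spark is predefined) of the kind of PySpark-to-Snowflake read this work involves; all connection options and the table name below are placeholders, not details from this posting:

    # Read a Snowflake table into a Spark DataFrame via the Snowflake Spark connector.
    # Every option value is a placeholder; in practice, pull credentials from a secret scope.
    sf_options = {
        "sfUrl": "myaccount.snowflakecomputing.com",  # placeholder account URL
        "sfUser": "SVC_PIPELINE",                     # placeholder service user
        "sfPassword": "********",                     # never hard-code real passwords
        "sfDatabase": "ANALYTICS",
        "sfSchema": "PUBLIC",
        "sfWarehouse": "COMPUTE_WH",
    }

    orders = (
        spark.read.format("snowflake")  # connector short name registered on Databricks
        .options(**sf_options)
        .option("dbtable", "ORDERS")    # placeholder table name
        .load()
    )
    orders.show(5)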

Job Description: The Data Engineer will be responsible for building big data pipelines using open-source tools and enterprise frameworks in response to new business requests. This individual will work closely with data scientists and SMEs to define and map data requirements that can be translated into executable data processing pipelines.

The role covers the design and implementation of specific data models that ultimately help drive better business decisions through insights from a combination of external and internal data assets. It is also accountable for developing the necessary enablers and data platform in the Big Data Computing Environment and for maintaining its integrity across the data life cycle phases.

Daily Responsibilities:

  • Gather requirements for data integration and business intelligence applications.
  • Determine and document data mapping rules for movement of medium to high complexity data between applications.
  • Analyze existing PySpark/Scala/SnowSQL code, or build new code where necessary, to evolve existing prototypes into modern, scalable data processing pipelines using Snowflake and Databricks.
  • Work directly with the client user community as a data analyst to define and document data requirements.
  • Create reusable software components (e.g., specialized Spark UDFs; a minimal sketch follows this list) and analytics applications.
  • Support architecture evaluation of the enterprise data platform through implementation and launch of data preparation and data science capabilities.
  • Perform data quality validation. Employ data mining techniques to achieve data synchronization, redundancy elimination, source identification, data reconciliation, and problem root cause analysis.
  • Build high-performance algorithms, prototypes, predictive models, and proofs of concept.
  • Support data selection, extraction, and cleansing for enterprise applications, including data warehouse and data marts.
  • Investigate and resolve data issues across platforms and applications, including discrepancies of definition, format, and function.
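As noted in the responsibilities above, here is a minimal, self-contained sketch of a reusable PySpark UDF; the normalization logic, column names, and sample data are illustrative assumptions, not requirements from this posting:

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F
    from pyspark.sql.types import StringType

    spark = SparkSession.builder.appName("udf-sketch").getOrCreate()

    # Hypothetical reusable UDF: reduce a free-text phone number to its digits.
    @F.udf(returnType=StringType())
    def normalize_phone(raw):
        if raw is None:
            return None
        digits = "".join(ch for ch in raw if ch.isdigit())
        return digits or None

    df = spark.createDataFrame([("(972) 555-0100",), (None,)], ["phone"])
    df.select(F.col("phone"), normalize_phone(F.col("phone")).alias("phone_digits")).show()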

Required Qualifications and Skills:

  • 8+ years of Data Warehousing and Big Data Technology experience.
  • 3+ years of experience with Databricks, preferably Azure Databricks.
  • 3+ years of strong PySpark scripting experience.
  • Strong knowledge of Data Quality Management
  • Strong understanding and use of databases: relational (especially SQL) as well as NoSQL datastores
  • Intermediate knowledge of Snowflake required
  • Prior experience with data exploration, prototyping and visualization tools: e.g., Zeppelin, Jupyter, Power BI, Tableau
  • Prior experience with deploying complex data science solutions is a strong plus

Desired Qualifications:

  • Experience working in the telecommunications industry

Education: Bachelor's or Master's degree in Computer Science or equivalent

Ask the following questions to the candidate:

  • Do you have working knowledge of Hadoop and/or experience with Palantir? If yes, please relate it to a project on your resume.
  • Do you have working knowledge of Snowflake? If yes, please relate it to a project on your resume.
  • Have you migrated data into Snowflake?
  • Have you developed data pipelines in Snowflake from scratch?
  • How experienced are you in PySpark scripting? Please rate yourself on a scale of 1 to 10.
