Job Description:

Position: Lead Data Engineer

Location: Bellevue, CA; NJ; or Dallas (Hybrid Role)

Length: 1+ year

The Lead Data Engineer will be responsible for the end-to-end design, build, monitoring, and support of the Big Data architecture, including both the platform and its applications. The candidate must be self-driven and extremely hands-on.

SUMMARY OF ESSENTIAL JOB FUNCTIONS 

  • Function as the Technical Decision Authority and own the data strategy and roadmap, data architecture definition, and the business intelligence/data warehouse solution and platform
  • Ensure deployment of modern data structures and models to enable reliable and scalable data products and feature stores 
  • Monitor the end-to-end application stack (Hadoop, Vertica, Tableau, Hue, Superset, etc.) and lead and provide operational support for all applications
  • Develop processes for automating, testing, and deploying the work 
  • Identify risks and opportunities of potential logic and data issues within the data environment 
  • Collaborate effectively with the global team and ensure day-to-day deliverables are met
  • Lead other team members and provide technical leadership in all phases of a project from discovery and planning through implementation and delivery. 
  • Lead the resolution activities for complex data issues. 
  • Develop and maintain documentation relating to all assigned systems and projects 
  • Independently perform proofs of concept for new tools and frameworks and present findings to leadership
  • Work in an agile environment; develop and drive multiple cross-departmental projects
  • Establish effective working relationships across disparate departments to deliver business results. 

 

MINIMUM REQUIREMENTS 

  • 7+ years as a Lead Data Engineer
  • Airflow framework (required)
  • Top 3 technologies: Spark, Python, and Shell (required)
  • Spark experience required (PySpark required)
  • Good knowledge of open-source technologies
  • Strong experience with big data tools and data processing: Hadoop, Spark, Scala, Kafka, Shell, YARN, Java, etc.
  • Ability to debug shell scripts, Hive jobs, and Hadoop issues is a must
  • Experience with object-oriented/functional scripting languages: Python, Java, Scala, etc.
  • Experience with SQL/NoSQL databases such as Vertica, Postgres, Cassandra, etc.
  • Experience with data modeling highly preferred
  • Experience with streaming/event-driven technologies such as Lambda, Kinesis, Kafka, etc. is a nice-to-have
  • Nice to have but not required: exposure to ML frameworks (PyTorch/TensorFlow), model management and serving, containerization, and application development
  • 3+ years of manufacturing/high-tech experience preferred
  • Prior experience as a senior data architect, technical lead, system architect, or similar is required 
  • Excellent verbal and written communication skills.  

 

 

             
