Job Description:
· Own the core company data pipeline, and scale data processing to keep pace with Lyft's rapid data growth
· Continuously evolve data models and schemas based on business and engineering needs
· Implement systems that track data quality and consistency
· Develop tools supporting self-service data pipeline management (ETL)
· Tune SQL and MapReduce jobs to improve data processing performance
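As a minimal sketch of the data-quality tracking this role describes, the hypothetical check below computes a few simple metrics over an in-memory SQLite table; the table and column names (`rides`, `ride_id`, `fare`) are illustrative assumptions, not part of the posting.

```python
import sqlite3

def check_quality(conn):
    """Return simple quality metrics for a hypothetical `rides` table."""
    cur = conn.cursor()
    total = cur.execute("SELECT COUNT(*) FROM rides").fetchone()[0]
    # Rows missing a required value
    null_fares = cur.execute(
        "SELECT COUNT(*) FROM rides WHERE fare IS NULL").fetchone()[0]
    # Rows whose primary key repeats
    distinct = cur.execute(
        "SELECT COUNT(DISTINCT ride_id) FROM rides").fetchone()[0]
    return {"rows": total, "null_fares": null_fares,
            "duplicate_ids": total - distinct}

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE rides (ride_id INTEGER, fare REAL)")
    conn.executemany("INSERT INTO rides VALUES (?, ?)",
                     [(1, 12.5), (2, None), (2, 7.0)])
    print(check_quality(conn))  # → {'rows': 3, 'null_fares': 1, 'duplicate_ids': 1}
```

In a production pipeline these counts would typically be emitted per run and alerted on, rather than printed.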
Experience & Skills:
· Extensive experience with the Hadoop ecosystem or similar (MapReduce, YARN, HDFS, Hive, Spark, Presto, Pig, HBase, Parquet)
· Proficient in at least one SQL dialect (MySQL, PostgreSQL, SQL Server, Oracle)
· Good understanding of SQL engine internals, with the ability to perform advanced performance tuning
· Strong skills in a scripting language (Python, Ruby, Perl, Bash)
· Experience with workflow management tools (Airflow, Oozie, Azkaban, UC4)
· Comfortable working directly with data analysts to bridge business requirements and data engineering
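The workflow management tools listed above (Airflow, Oozie, Azkaban, UC4) all schedule ETL tasks as a dependency graph, running each task only after its upstream dependencies finish. A minimal sketch of that idea, using Python's standard-library `graphlib` and an assumed, purely illustrative task chain:

```python
from graphlib import TopologicalSorter

# Hypothetical ETL task graph: each task maps to the set of tasks it
# depends on (the names are illustrative, not from the posting).
deps = {
    "extract": set(),
    "clean": {"extract"},
    "aggregate": {"clean"},
    "load": {"aggregate"},
    "quality_check": {"load"},
}

# static_order() yields tasks so that every dependency precedes its dependents.
order = list(TopologicalSorter(deps).static_order())
print(order)  # → ['extract', 'clean', 'aggregate', 'load', 'quality_check']
```

Real schedulers layer retries, backfills, and parallel execution on top of this ordering, but the dependency-graph core is the same.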