Job Description:
- Design, develop, and operate our data warehouse infrastructure
- Implement and operate data pipelines that collect millions of metrics per day from client applications and users
- Architect our next-generation analytics and machine learning platform
- Write code to develop new software products and features; manage individual project priorities, deadlines, and deliverables
- Work with data scientists and analysts to put machine learning models into production
- Build tools, frameworks, and dashboards to support running experiments and analyses
- Lead data governance and review all data and database schema changes

Qualifications:
- Experience setting up and managing multi-TB data warehouses and HDFS clusters (Azure Data Warehouse or Redshift experience preferred)
- Experience with ETL processes that transform data from a variety of sources into a normalized form
- Experience building data pipelines leveraging open-source technologies such as Kafka, Hadoop, Hive, Pig, and Spark
- Experience working with relational databases (e.g. PostgreSQL or SQL Server), including connection pooling, performance tuning, and optimization
- Experience with database design and SQL/NoSQL data stores
- 3+ years programming in SQL and Python
- BS degree in Computer Science or a related technical discipline, or equivalent work experience
- Experience designing and developing large-scale systems software, including work in Unix/Linux environments
Preferred Qualifications:
- Experience building data pipelines leveraging Azure/Microsoft technologies (e.g. Azure Data Lake, HDInsight)
- Experience with machine learning libraries such as MLlib, scikit-learn, TensorFlow, and/or Keras
- Experience with modern visualization tools such as Grafana, Kibana, and Jupyter notebooks
- Experience with data governance and managing large data schemas
- Experience working with Python and Spark in a production environment