Job Title: Machine Learning Engineer
Location: New York City, NY
Duration: Full Time

The Viacom Data Platform is looking for an awesome Machine Learning Engineer with hands-on experience in developing and maintaining scalable machine learning applications and pipelines in a fast-paced business environment.

The Data Platform is responsible for ensuring the right people have the right data at the right level of granularity across our brands, such as Nickelodeon, MTV, BET, and Comedy Central. We enable teams driving video, marketing, and advertising to be data-driven, and we are implementing practical machine learning approaches to enhance the work people do here. We take an internal open source approach to our work and believe our code needs to be top quality in order to succeed. Be a part of this growing team in a challenging but fun environment.

Building, maintaining, and optimizing data pipelines and ETL jobs
Working with text, video, and images to extract useful metadata
Collaborating across the team to shape the Data Platform technology stack
Writing technical documentation
Improving automation and test coverage (unit, integration, and user acceptance tests, etc.)
Keeping up to date with modern data engineering technologies
Your profile:
5+ years of software engineering or data science experience with 2+ years in data engineering, plus a Master’s degree in Computer Science or a related field (or a Bachelor’s degree combined with equivalent professional experience)
Extensive experience with Python and its libraries (e.g. NLTK, gensim, spaCy, and pandas)
Experience with Apache Spark (especially SparkML) and/or Apache Hadoop
Strong experience in NLP, specifically topic modeling and text classification
Experience with build and dependency management tools (Gradle, sbt, Maven, npm)
Understanding of TF-IDF and tools like Apache Solr
Ability to build cloud-native apps on Amazon Web Services
Solid experience with agile development processes like Scrum and Kanban
Insistence on automating everything (testing, continuous integration, continuous delivery)
Experience working on the command line in a Unix/Linux environment
Nice to have:
Experience with one or more NoSQL databases such as Cassandra, DynamoDB, MongoDB, Neo4j, or RDF triple stores
Experience with data pipeline tools such as Apache Airflow, Luigi, or AWS Data Pipeline
Experience scaling machine learning pipelines