Job Description:
Title - Machine Learning Lead / Machine Learning Dev Lead
Location - Stamford, CT
Primary Skill - ML Ops, PySpark

Responsibilities:
Code and architect end-to-end applications on a modern data processing technology stack: Hadoop, Hive, AWS Cloud, and Spark ecosystem technologies
Embed with the Data Science COE / Product to ensure that new algorithms / models being built can be supported
Review design, code, data, and feature implementations performed by other data engineers to maintain data engineering standards
Work with model developers to improve efficiency and make modeling tradeoffs
Build continuous integration/continuous delivery, test-driven development, and production deployment frameworks
Troubleshoot complex data, cleaning, tagging, feature, and rule issues, and perform root cause analysis to proactively resolve product and operational issues
Productionize the full pipeline, including distributed Machine Learning models (e.g., training/test pipeline, data layer, feature layer, etc.)
Connect business context and perspective to define model objective functions, features, business rules, prioritization, measurement, etc.
Enforce effective cost optimization techniques in cloud, on-prem, and edge environments to minimize the total cost of ownership of the machine learning product services
Work on analytics application infrastructure analysis & requirements (e.g., configuration, access, tools, services, compute capacity, etc.)

Qualification:
10+ years of experience in software development in an agile environment
5+ years of Big Data and Spark experience building and running complex pipelines/applications at scale in production
3+ years of experience in Machine Learning, AWS Cloud, EMR, Hadoop, Spark, Kafka, Kinesis, Hive, Redshift, DynamoDB, Lambda
3+ years of experience in PySpark, Python, HQL, Shell Scripting, SQL, Java / Scala
In-depth knowledge of Python data processing and machine learning libraries
Experience with and understanding of Python ML frameworks such as Scikit-learn, TensorFlow, and PyTorch
Experience with API design & development
Proficient in Spark, Airflow / Oozie, NoSQL, Git/CodeCommit
Passion for writing well-structured, testable code with a focus on readability and maintainability
Experience implementing ML models and building highly scalable, high-availability systems
Experience operating in distributed environments, including cloud (AWS, GCP, etc.)
Experience building, launching, and maintaining machine learning pipelines in production
Experience working in an agile, sprint-based style
Experience working side-by-side with product owners and translating business needs into analytics solutions
Proven ability to balance near-term results (e.g., ability to design and execute on an 'MVP' model) with long-term goals