Data Engineer with Python and hands-on experience using MLflow. The project is to set up experiments with MLflow, including saving artifacts, using the MLflow APIs, and configuring MLflow and Databricks to run experiments.
Job Description:
We are seeking a skilled Hybrid Data Engineer to manage and optimize our machine learning lifecycle across a hybrid environment of on-premises open-source tools and cloud-based platforms.
The ideal candidate will have experience working with MLflow, Databricks, and open-source machine learning frameworks, enabling seamless integration, experiment tracking, model management, and deployment.
Responsibilities:
- Design, implement, and maintain a hybrid ML infrastructure that connects on-premises open-source tools with cloud platforms like Databricks.
- Enable experiment tracking, model versioning, and lifecycle management using MLflow integrated with Databricks notebooks.
- Manage the registration, versioning, and deployment of machine learning models across environments.
- Collaborate with data scientists and ML engineers to streamline workflows and ensure scalable, reliable model deployment.
- Configure MLflow servers and APIs to connect securely and efficiently with existing data sources and deployment targets.
- Optimize data pipelines and model deployment processes for both on-premises and cloud environments.
- Ensure best practices for reproducibility, security, and scalability in ML workflows.
Qualifications:
- Proven experience with MLflow, Databricks, and open-source machine learning frameworks.
- Strong understanding of data engineering, model lifecycle management, and cloud computing.
- Experience with deploying and managing machine learning models in hybrid environments.
- Knowledge of Apache Spark, Python, and Linux/Unix systems.
- Excellent collaboration and communication skills.