Job Description:

Job Summary:

We are looking for a highly skilled Ops Engineer with strong expertise in Terraform, Databricks, MLOps, and Large Language Models (LLMs). This role is ideal for someone passionate about automating infrastructure, optimizing data and machine learning workflows, and supporting the deployment of advanced AI models in production environments. You will work closely with data scientists, ML engineers, and DevOps teams to enable scalable, secure, and reliable ML and LLM infrastructure.

Required Skills:

  • Strong experience with Terraform (Cloud/Open Source) for cloud infrastructure automation.
  • Proficiency with Databricks: clusters, jobs, notebooks, DBFS, and workspace management.
  • Hands-on experience with MLOps tools (e.g., MLflow, TFX, Kubeflow, Airflow, or SageMaker Pipelines).
  • Experience deploying and scaling Large Language Models (LLMs) in production environments.
  • Proficiency in CI/CD tooling (GitHub Actions, Azure DevOps, GitLab CI, etc.).
  • Scripting and automation using Python, Bash, or similar languages.
  • Knowledge of cloud services (AWS, Azure, or GCP), especially for ML workloads.
  • Understanding of containerization (e.g., Docker) and orchestration (e.g., Kubernetes) for model deployments.
  • Strong problem-solving, troubleshooting, and documentation skills.

Equal Employment Opportunity:

We are an equal opportunity employer. All aspects of employment, including the decision to hire, promote, discipline, or discharge, will be based on merit, competence, performance, and business needs. We do not discriminate on the basis of race, colour, religion, marital status, age, national origin, ancestry, physical or mental disability, medical condition, pregnancy, genetic information, gender, sexual orientation, gender identity or expression, citizenship/immigration status, veteran status, or any other status protected under federal, state, or local law.

             
