The Cloud Administrator SME (Onshore) is a critical, customer-facing technical expert responsible for the 24x7 operational health, effective L2 incident management, and onsite support of our hybrid, multi-cloud infrastructure across Azure, GCP, and OCI, which supports mission-critical healthcare systems such as EPIC Electronic Health Record (EHR) workloads. This role requires hands-on expertise in IaC and container orchestration (GKE), strict compliance with HIPAA and other healthcare regulations, and a strong focus on high-touch coordination during major hospital events.
Key Responsibilities:
• Serve as the primary L2/L3 Onshore technical expert for all cloud infrastructure and EPIC workload incidents, ensuring rapid diagnosis and resolution to minimize impact on clinical operations.
• Lead communications during active incidents, providing clear, concise, and professional updates to hospital IT teams, clinical stakeholders, and internal leadership.
• Perform onsite, hands-on support during critical periods, including EPIC go-lives, major version upgrades, patching windows, and disaster recovery drills.
• Coordinate directly with EPIC technical staff, hospital IT teams (network, security, application), and vendors to resolve complex cross-functional dependencies.
• Perform daily administration, monitoring, and proactive remediation of IaaS/PaaS services across Azure, GCP, and Oracle Cloud Infrastructure (OCI) supporting EPIC environments.
• Operational expertise with Google Kubernetes Engine (GKE): Manage the deployment, scaling, health, and operational troubleshooting of containerized EPIC components or supporting microservices.
• Perform resource provisioning, capacity monitoring, and infrastructure tagging for cost management and chargebacks across all multi-cloud tenants.
• Maintain and validate IAM, backup, and geo-redundant Disaster Recovery (DR)/Business Continuity Planning (BCP) mechanisms for EPIC and integrated hospital systems.
• Implement and enforce security baselines and compliance controls (e.g., HIPAA, SOC 2, CIS, NIST) at the infrastructure layer across Azure, GCP, and OCI.
• Drive automation efforts for routine operational tasks and provisioning using Terraform for IaC and scripting (PowerShell, Python) to ensure operational consistency and auditability.
• Manage log monitoring, correlation, and initial incident response for security and performance events within EPIC cloud environments.
Qualifications & Technical Skills:
• Tools: Terraform, GitHub, Ansible, Tanium, PowerShell, YAML, Bash, Python.
• Experience & Environment: 7-10 years in infrastructure operations, with at least 5 years dedicated to Cloud Administration and 5+ years supporting EPIC EHR or similar mission-critical clinical systems in a hybrid environment.
• Multi-Cloud Hands-on Expertise: Proven expertise in the administration, configuration, and operational support of Microsoft Azure, Google Cloud Platform (GCP), and Oracle Cloud Infrastructure (OCI).
• Containerization: Strong practical experience in the operational management, troubleshooting, and administration of Google Kubernetes Engine (GKE) clusters.
• EPIC Systems Knowledge: Solid understanding of EPIC infrastructure requirements, deployment topologies (e.g., Clarity, Chronicles), and operational dependencies.
• Automation & Scripting: Highly proficient with:
• IaC: Terraform for multi-cloud deployments.
• Scripting: PowerShell, Python, YAML, and Bash.
• Compliance: Deep knowledge of HIPAA requirements and operational procedures necessary to maintain compliance in a cloud environment.