• Perform day-to-day management of cloud platform operations, identify issues and risks, and recommend mitigation strategies for the project.
• Lead the operations team in providing tier 1 and tier 2 support for cloud platforms.
• Design, develop, and maintain incident management workflows.
• Monitor issues, provide resolutions, and keep status reports up to date.
• Coordinate emergency response for severity 1 and severity 2 incidents.
• Monitor and coordinate all system operations, including security procedures, and liaise with infrastructure, security, DevOps, data, and application teams.
• Ensure that necessary system backups are performed and that backups are stored and rotated.
• Monitor and maintain records of system performance and capacity in order to arrange vendor services or other reconfiguration actions and to anticipate requirements for system expansion.
• Coordinate major and minor software installations, upgrades, and patch management.
• Must demonstrate a broad understanding of client IT environment issues and solutions and be a recognized expert within the IT industry.
• Must demonstrate advanced teaming and mentoring abilities and excellent written and verbal communication skills.
• At least eight (8) years of experience managing on-premises and cloud-based multi-user environments, with expertise in planning, designing, building, and implementing IT systems.
• At least eight (8) years of product administration/management experience in a RHEL-based Linux environment or a Windows Server environment.
• 5+ years of experience leading tier 1 and tier 2 support for cloud platforms.
• 3+ years of experience operating large platforms of more than 2,000 instances.
• Experience working with incident management and change management tools.
• Experience developing and maintaining incident management and change management workflows.
• Experience working with incident triaging and knowledge management.
• Experience working with security teams to perform security monitoring, audits, and remediation.