Job Description:
Hi,

Please find the job description below and share profiles:

Skill set is as provided below.

They are looking for 4 resources in San Jose, CA

Position: New SOT Staffing
Duration: 3 to 6 months
Location: San Jose, CA

SKILLSET
Knowledgeable in cloud/OpenStack technologies: Nova, Neutron, Keystone, Cinder, Glance, Swift, MySQL
Previous experience with production Linux and with monitoring tools like Zabbix, Sensu, Grafana, and Splunk
Engineering skills
Experience with at least one high level programming language like Python, Ruby, or Java (Python is preferred)
Ability to work as a member of a large, distributed and diverse team
Good understanding of software development fundamentals
Capacity for learning new technologies and skills rapidly is key
Prior advanced-level experience with Linux
Prior advanced-level experience with OpenStack
Prior advanced-level experience with configuration management systems (Puppet is preferred)
Prior advanced-level experience with computer networking and storage
2+ years advanced-level experience with OpenStack
7+ years advanced-level experience with Linux

Must:
Experience with at least one high level programming language like Python, Ruby, or Java (Python is preferred)
Prior advanced-level experience with Linux
Prior advanced-level experience with OpenStack

ROLES AND RESPONSIBILITIES

Monitor email alerts/live dashboards and escalate/resolve issues by following SOPs.
Monitor projector screens for deviations in key business metrics
Perform periodic health checks and proactive reviews of monitoring reports
Perform regular audit of site for DR and Compliance symmetry
Create and maintain SOP documents
Perform LTS (live to site) checks for onboarding new systems in cloud
Perform scheduled cloud partition maintenance, cloud object stats gathering and cap adds
Handle ServiceNow and Jira tickets
Attend team meetings
Be physically present for each shift
Perform warm handoffs when the shift ends
Communicate with the TDO (Technical Duty Officer) before making any changes to cloud during incident triaging

INCIDENT MANAGEMENT

Create incident and restoration tickets and manage the ticket queue
Drive engagement with infrastructure partners like Infra Ops, storage, network, and key stakeholder teams for incident resolution
Vendor engagement to drive incident resolution (Collect logs, Create vendor tickets, Upload all diagnostic data, Monitor vendor tickets, Attend vendor calls for driving RCA)
Maintain a list of system/storage/database bugs that the PayPal site is vulnerable to
For Tier 1 ATB (Availability to Business) impacting incidents, follow the escalation communication protocol
Join the main bridge in the event of a functional or site-wide TCB

CHANGE MANAGEMENT

File a Change ticket (Emergency Change ticket) for site restoration and maintenance work, adhering to PayPal policy
Follow up for necessary approvals of the change tickets
Align resources for change execution and validation
Follow appropriate communication and execution protocol during change execution
             
