Job Description:

Job Title: Applications System Engineer – Cloud (Cloud Systems Engineer)

Job ID: 37801

Location: El Segundo, CA 90245

Duration: 6-12+ Months with possible extensions 

Interview Process: Phone/WebEx

Number of Positions: 1

*** DIRECTV position

*** Prefer candidates local to Los Angeles, CA, but any location will be considered.

*** No OT (unless pre-approved)

Required Skills:

Cloud Systems Engineer

Kafka Administration

AWS

Security protocols (Kerberos)

Scaled Agile Framework / Agile Scrum methodologies (a plus)

Top 5 Skills / Additional Job Posting Description Details

Top skills required:

Kafka Administration

AWS

Security protocols (Kerberos)

Additional job description:

Design, develop, deploy, and debug messaging and streaming deployments using Kafka.

Build Kafka and ZooKeeper nodes, and set up Kafka Connect, Schema Registry, and high-availability clusters.

Administer the Kafka platform, including backup and mirroring of Kafka cluster brokers, broker sizing, topic sizing, hardware sizing, performance monitoring, broker security, topic security, and consumer/producer access management (ACLs).

Understanding of cloud environments (AWS) and their management, container technologies such as Docker and Kubernetes, and monitoring tools such as Grafana and Prometheus.

Experience with Kerberos security configuration is a plus.

Experience with the Scaled Agile Framework and Agile Scrum methodologies is a plus.

Roles & Responsibilities:  

1) Support new cloud solutions, including UAT and ORT testing and driving operational requirements. Outputs include Test Plans, Test Cases, Test Execution, Production Verification, and Developer and Administrative Portals.

2) Maintain overall platform stability, security, and supportability. Interface with development teams, vendors, product, sales, and customers. Engineer operational requirements, test cases, and technical documentation, and provide operational support.

3) Monitor, maintain, and support cloud solutions. Ensure standard processes exist for systems installation, configuration, patching, and upgrading.

4) Develop guidelines for systems monitoring for maximum availability, uptime, and performance. Research, troubleshoot, and resolve system issues and outages.

5) Support web portals, orchestration, platform, application, API, network, and monitoring needs. Provide platform support (monitoring capabilities, performance tuning, capacity planning, and maintaining overall service and platform health).

6) Facilitate post-outage reviews (POR), manage Service Incident Reporting (SIR), and develop and deliver Root Cause Analysis (RCA) to management and to the customer.

             
