Job Title: Production Support Engineer
Job ID: 37225
Location: El Segundo, CA 90245
Duration: 12-24+ Months with possible extensions
Interview Process: Phone/WebEx
Number of Positions: 3
***DIRECTV***
*** Remote positions
***Candidates will be required to work West Coast shift hours: four 10-hour days a week, including nights & weekends.
Required Skills:
Production Support Engineer:
Scripting languages (SQL/Python/UNIX shell scripts):
Monitoring tools (Grafana/New Relic/Moogsoft/Splunk/ELK/IneoQuest/Harmonic VOS):
Data Analysis/Systems Analysis exp:
Top 5 Skills:
Experience with troubleshooting and systems analysis
Experience with scripting languages such as SQL, Python, and UNIX shell scripts
Experience with event-based monitoring and navigation of tools to analyze and identify faults on all OTT streaming platforms
Excellent interpersonal and communication skills – written and verbal
Ability to make decisions and work in high-stress situations
Experience navigating monitoring tools and dashboards, such as Grafana, New Relic, Moogsoft, Splunk, ELK, IneoQuest, and Harmonic VOS, to analyze and identify faults on all OTT streaming platforms
Experience with Data Analysis and/or Systems Analysis
Familiarity with OTT streaming products and systems
Preferred Qualifications:
Associate or BS Degree in Computer Science (CS) or related discipline
1-2 years’ experience with scripting languages such as Python and UNIX shell scripts
Experience navigating monitoring tools and dashboards, including ServiceNow, Moogsoft, Grafana, New Relic, Splunk, ELK, IneoQuest, and Harmonic VOS, to analyze and identify faults on all OTT streaming platforms
Familiarity with OTT streaming products and systems
Understanding of network concepts and implementations
Understanding of software development principles
Understanding of cloud concepts and implementations
Familiar with Agile software development practices
Attention to detail and the ability to complete work with a high degree of accuracy
Ability to multitask efficiently and succeed in high-pressure situations
Excellent interpersonal and communication skills
Proactive, flexible, innovative
Passionate and empathetic about DIRECTV customers
Possess an “everything has a solution” mentality and a fearless ‘go-getter’ attitude
Roles & Responsibilities:
1) Providing data network operational support, design, engineering, and planning for data network and communications projects involving TCP/IP and related protocol connectivity for networks supporting Client's customers.
2) Managing interoperability between Cisco and Juniper Layer 3 platforms through the use of alarm and ticket systems, individually designed custom scripts, customer notification, and business partner escalations.
3) Troubleshooting complex cross-tower issues, which may or may not have data network root causes.
4) Providing real-time, in-depth analysis and real-time trouble resolution of incidents associated with the Cisco, Juniper, and associated Operations Support Systems and Data Communications Network Technology platforms.
5) Onsite technical escalation point for the remote LAN, WAN, Firewall, and DNS/DHCP teams.
6) Managing the fault and change ticket queue to ensure proper follow-up is performed within required time intervals.
What You Will Do @ DIRECTV Video & Technology Streaming Operations:
• Be a member of the 24x7 team that is responsible for production support of all streaming products on the Open Video platform (currently: DIRECTV STREAM, DIRECTV Everywhere, and NFL Sunday Ticket)
• Investigate & troubleshoot production issues, including deviations in KPIs on Headend and clients. Utilize information in tools to further isolate the source of errors and effectively direct troubleshooting efforts.
• Utilize multiple tools to monitor the health of each platform and, when incidents are detected, take action to mitigate through first touch resolution or escalating to respective service owners.
• Take ownership of incident & problem management for the platforms/products, including communications, documentation, and bridge facilitation, with the goal of minimizing customer impact
• Maintain high accuracy of incident data in tools
• Analyze risk and assign appropriate impact-based severity to incidents
• Identify alerting & monitoring gaps and assist responsible teams to close gaps by taking action and/or through a feedback loop
• Identify opportunities for automated remediation of common platform issues and assist in development of remediation scripts when possible
• Develop documentation and knowledge base for OpenVideo systems to foster expanded troubleshooting capabilities across the engineering and operations organizations
• Develop and deliver training and share knowledge to help increase the skill level of the team
• Work with development and operations teams to drive constant improvement in reliability, operability, and performance.