Job Description:
· Experience with alerting and reporting for a production service: 2+ years

· Scripting and software development ability to automate tasks: 2+ years

· Automating reports that show where issues have occurred: 2+ years

Develop complex software programs, including operating systems, applications, and/or network products. Design and analyze moderately complex programs and systems. Analyze performance bottlenecks and system failures. Take the primary role in smaller, low-risk projects. Contribute to the development of processes and methodology. Assist in the development of assignments and schedules.

- Own support and monitoring of the production and dogfood services
- Proactively investigate errors, warnings, and signals from all telemetry (Azure and internal)
- Manage alerts in Application Insights: write and tune queries to adapt to changes in the service and to reduce false positives
- Monitor Azure Security Center and file bugs to capture real issues for the engineering team to fix
- Escalate issues if a customer is potentially impacted, track progress and resolution in the CRM, and escalate to the engineering team on VSTS if necessary
- Monitor SQL database and App Service compute consumption and upgrade capacity as needed
- Create tickets with Azure Support/Premier if necessary (e.g., Azure quotas,
- Write alerts to detect warning and error conditions in the service
- Develop metrics and reports to show service performance
- Create scripts that quickly parse logs to help diagnose common causes of problems



             
