Job ID 221408
Job Title: Major Incident & Problem Manager
Salary Range: £50-55k p.a.
Location: Home based UK
Responsible for managing the major incident process for urgent and high business-impact incidents, ensuring that customer communications, triage, 3rd party escalations and service resolution are all managed with urgency, care & professionalism.
The Major Incident & Problem Manager is responsible for owning incident and problem resolution, working collaboratively with our 3rd party business support partners and across multiple business departments to identify root causes, record and resolve problems, and avoid incidents.
The individual is also responsible for driving the root cause analysis (RCA) investigation and report to completion within SLAs following a major incident and service restoration.
This role is also responsible for ensuring proactive problem management is in place to enable the employer to identify and remove the root cause of potential problems before they result in incidents.
The Major Incident and Problem Manager is a key member of the operational support team with a long-term focus on avoiding serious incidents and constantly improving service operations, across all facets of service delivery from network to application.
* Manage major incidents from identification to service restoration and closure
* Drive the root cause analysis and review the RCA documentation within SLA
* Regular, professional communication on major incidents to internal & external customers within contracted SLAs
* Communication of incidents to customers in a clear and meaningful way
* Co-ordination between multiple internal and external support teams from IT infrastructure to application management and engineering for effective resolution
* Escalate within support organisation as required.
* Manage customer / support escalated incidents
* Co-ordinate the plan required to ensure change
* Obtain necessary approvals from all stakeholders
* Communicate any outage due to retrospective change
* Create an action plan for issue remediation during incident troubleshooting
* Regularly review lower priority incidents across the service base to spot emerging higher priority issues and identify root cause fixes
* Work with alerting & monitoring teams to pro-actively avoid high priority incidents
* Identify changes in the support processes and suggest improvements to the Incident Management process accordingly
* Run major incident ‘blameless post-mortem’ sessions after service restoration to ensure avoidance of repeat incidents
* Lead post incident review meetings with internal & external 3rd party support teams
* Track and manage Problem Records with 3rd party support providers, providing reporting and updates to internal stakeholders
* Drive continuous service improvement and incident avoidance with our 3rd party support providers
* Response & Resolution times for Major Incidents within client contracted SLAs
* Production of Root Cause Analysis documentation within SLA
* Reduction of Major Incidents
* Resolution of Problem Records
* Avoidance of repeat incidents
Requirements for the role:
Strong background in Incident Management working with 3rd party service providers
Understanding of IT Infrastructure and Operating Systems
Strong experience of delivering service quality within an ITIL framework
Knowledge of ITIL, Infrastructure related technologies & understanding of business relevance of the technologies
Experienced in managing multiple vendors
Good experience in managing conference calls or incident resolution meetings
Strong written and verbal communication skills
Strong experience of IT service, operations, and support
Certified ITIL (Foundation) V3 beneficial
Excels under pressure
Knowledge of Major Incident & Definition
Good problem-solving techniques
Excellent written and verbal communications
Excellent interpersonal and team-working capabilities
Focused, pro-active, organised
Passionate about customer care
Agile and able to adapt to the business requirements