Getting the incident escalation process right - A guide to an effective escalation process

Ralph Bockisch
17.07.2024

Learn effective incident escalations and improve your incident management with best practices using useful tools such as knowledge databases and prioritisation matrices.

Author: Ralph Bockisch
created on: 17.05.2024, last change: 26.06.2024

Understanding the incident escalation process correctly

For many, incident management falls into the category of "has to be done, but I'd much rather be working on something else". However, a good understanding of how to escalate incidents appropriately matters: it removes obstacles in the incident management process and prevents confusion.

The first step in incident handling is a critical self-assessment: "Can I resolve the incident directly?" If the answer is "no", the incident should be forwarded to someone who has the necessary expertise or decision-making power to help. There are also tools and methods that make a "yes" more likely. For example, a well-maintained knowledge database is a valuable aid: it provides tried and tested answers that make it possible to solve even unfamiliar problems. The shift-left approach, which moves problem-solving knowledge closer to first-level support and end users, can also promote the exchange of knowledge and increase efficiency.

Another important aspect is correctly assessing to whom the incident should be escalated. There are basically two main forms of incident escalation: hierarchical and functional escalation.

  • Hierarchical escalation means passing the incident on to someone with decision-making authority. This is particularly useful when major efforts or approvals are required.
  • Functional escalation forwards the incident to a person or team with specific expertise. This can be done either by email or via service management software. Specialised tools even make it possible to escalate parts of an incident to other agent groups.
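The decision described above can be sketched in a few lines of Python. This is only an illustration: the names `Escalation` and `choose_escalation` and the two boolean inputs are assumptions made for the example, not part of any particular service management tool.

```python
from enum import Enum


class Escalation(Enum):
    NONE = "resolve directly"
    FUNCTIONAL = "functional escalation"
    HIERARCHICAL = "hierarchical escalation"


def choose_escalation(can_resolve: bool, needs_approval: bool) -> Escalation:
    """Pick an escalation path for an incident (illustrative sketch).

    can_resolve: the agent can solve the incident directly, for example
        with the help of a knowledge database article.
    needs_approval: resolving requires major effort or an approval that
        only someone with decision-making authority can give.
    """
    if can_resolve:
        return Escalation.NONE
    if needs_approval:
        return Escalation.HIERARCHICAL
    # Otherwise the missing piece is expertise, not authority.
    return Escalation.FUNCTIONAL


print(choose_escalation(can_resolve=False, needs_approval=False).value)
```

In practice the two inputs are judgement calls by the agent; the sketch only makes the order of the questions explicit.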

It is essential to determine both the impact and the urgency of the incident. If the problem is urgent and there is no solution in the knowledge base, the incident should be escalated to a subject matter expert as quickly as possible. An incident priority matrix, as taught in ITIL training courses, helps to set the right priorities and avoid misunderstandings during escalation. It is important to state precisely who the incident affects (the entire company, a location, a department or just one reporter) and how serious the work disruption is (can those affected work, work with restrictions or not work at all?).

Finally, when escalating or passing on the incident, care should be taken to provide as many supporting and detailed records as possible. A handover log that summarises the measures already taken, or a checklist, works best. This makes the handover process much easier and ensures that the incident can be processed quickly and efficiently.

Determine impact and urgency - prioritise

Another important aspect of incident management is determining the impact and urgency of an incident. These two parameters then determine the correct priority of the incident using a matrix.

From the reporter's point of view, the urgency of an incident always seems clear, but that view is a personal one. One person is emotionally involved and feels massively disrupted in what they were about to do; another assumes that they are not the only one with a concern and that the helpdesk has more pressing things to do.

EcholoN provides users with a matrix of urgency and impact so that the priority of an incident does not depend on the reporter's judgement or on how well they communicate it. This helps to evaluate the incident objectively and factually and thus determine an appropriate priority.

Urgency

The urgency is an assessment of how quickly an incident needs to be resolved. Does the incident affect the ability of the people concerned to carry out their work, and to what extent? To what extent are business operations disrupted? Will the outage become increasingly widespread, and possibly quickly? What cannot be done while the incident persists?

Impact

The impact is an assessment of the scope of the incident. Who is affected? The entire company, a specific location, a department or just a single user? How extensive is the fault, and what (potential) damage does it cause? Is there a risk of reputational damage or contractual penalties?

The role of detailed records - incident record

An important part of incident management is the conscientious documentation of all work steps. When an incident is escalated, the records are of great importance. They ensure that all relevant information is available, so that the handover goes smoothly and processing can continue without loss of information.

Firstly, as many details of the incident as possible should be documented. This includes the circumstances under which the incident occurred, the affected systems or applications (configuration items, CIs) and the initial measures taken. Detailed and precise documentation helps the next agent to familiarise themselves with the incident more quickly.

Communication with the requester and other parties involved (emails, chat history and telephone calls) should also be part of the complete documentation of an incident. References and links to the knowledge database articles used, previous incidents or similar problems should also be linked to the incident.
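As a rough illustration, the items listed above could be held in one record that also produces a handover log. The class `IncidentRecord`, its field names and the sample values are hypothetical and chosen only for this sketch; a real service management tool will have its own data model.

```python
from dataclasses import dataclass, field


@dataclass
class IncidentRecord:
    """Hypothetical incident record; field names are illustrative."""

    summary: str
    circumstances: str
    affected_cis: list[str] = field(default_factory=list)
    measures_taken: list[str] = field(default_factory=list)
    communications: list[str] = field(default_factory=list)
    related_links: list[str] = field(default_factory=list)

    def handover_log(self) -> str:
        """Summarise the record as plain text for the next agent."""
        lines = [
            f"Incident: {self.summary}",
            f"Circumstances: {self.circumstances}",
            "Affected CIs: " + (", ".join(self.affected_cis) or "none recorded"),
            "Measures already taken:",
        ]
        # Number the steps so the next agent sees what was tried, in order.
        lines += [f"  {n}. {step}" for n, step in enumerate(self.measures_taken, 1)]
        lines.append(f"Communications on file: {len(self.communications)}")
        lines.append(
            "Related articles and incidents: "
            + (", ".join(self.related_links) or "none recorded")
        )
        return "\n".join(lines)


record = IncidentRecord(
    summary="Mail client cannot connect",
    circumstances="Started after the morning login",
    affected_cis=["mail-server-01"],
    measures_taken=["Restarted the client", "Checked network connection"],
    communications=["Email from requester"],
    related_links=["KB article on mail connectivity"],
)
print(record.handover_log())
```

Keeping the measures as an ordered list means the handover log doubles as a checklist of what has already been tried.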

Detailed documentation supports not only immediate problem solving but also long-term analysis and improvement. By analysing patterns and recurring problems, proactive measures can be developed to prevent future incidents. Regular employee training ensures that the quality of the records remains high. A standardised approach facilitates collaboration and contributes to consistency and transparency within the teams.
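One simple form of such pattern analysis is counting how often the same CI appears across incidents. The following is a minimal sketch under stated assumptions: the function name `recurring_cis`, the dictionary shape of an incident and the sample data are all invented for the example.

```python
from collections import Counter


def recurring_cis(incidents, min_count=2):
    """Return (CI, count) pairs for CIs named in at least min_count incidents.

    Each incident is a dict with an "affected_cis" list; this shape is an
    assumption made for the example.
    """
    counts = Counter(
        # set() ensures a CI is counted once per incident.
        ci for incident in incidents for ci in set(incident["affected_cis"])
    )
    return [(ci, n) for ci, n in counts.most_common() if n >= min_count]


incidents = [
    {"id": 1, "affected_cis": ["mail-server-01"]},
    {"id": 2, "affected_cis": ["printer-3f", "mail-server-01"]},
    {"id": 3, "affected_cis": ["mail-server-01", "vpn-gateway"]},
    {"id": 4, "affected_cis": ["vpn-gateway"]},
]
print(recurring_cis(incidents))
```

A CI that keeps showing up in this list is a candidate for a proactive measure, such as a knowledge database article or a root-cause investigation.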

Swarming approach to problem solving

The swarming approach is a collaborative approach to solving problems. In contrast to the usual approach, where an incident is handled by one person, the swarming approach builds on the advantage of collaboration. The incident is forwarded directly to a team of experts. Teamwork leads to a solution more quickly, as different expertise and perspectives are brought in at the same time.

An important component of swarming is communication. Open and regular communication within the team is encouraged to ensure that all members have the same level of information. Tools such as instant messaging, video conferencing and special collaboration platforms can support this process and facilitate collaboration.

This approach has several advantages:

  • Faster problem solving: Simultaneous processing by several experts can significantly reduce the time required to solve problems.
  • More effective use of resources: The expertise and specialist knowledge of team members is better utilised, which increases the quality of the solutions.
  • Promoting knowledge sharing: The swarming approach enables knowledge and experience to be shared and disseminated within the team, which contributes to continuous improvement and learning.
  • Reduction of escalations: As the incident is immediately escalated to a team of experts, frequent escalations to other levels are not necessary.

To successfully implement the swarming approach, clear guidelines and processes should be defined. This includes determining the criteria for when an incident is resolved through swarming, as well as the roles and responsibilities of team members during the swarming process.

By using modern communication tools and clear processes, this collaborative approach can be smoothly integrated into existing incident management structures.

Incident priority matrix and resources

The incident priority matrix is an important tool in incident management. It helps to assess the severity of an incident and prioritise it correctly. Using a prioritisation matrix ensures that resources are used efficiently and incidents are processed according to their severity and impact on business operations.

By crossing urgency and impact in the matrix, incidents can be clearly prioritised. For example, an incident with a high impact and high urgency has a very high priority and must be dealt with immediately, while one with a low impact and urgency has a lower priority.
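A priority matrix of this kind is essentially a lookup table. The sketch below uses the impact and urgency levels named earlier in this article; only the two corner cases (high impact with high urgency, low impact with low urgency) come from the text, while every other cell value, the priority labels and the function name `incident_priority` are assumptions that each organisation would define for itself.

```python
# Rows: impact (who is affected). Columns: urgency (how badly work is disrupted).
# Only the two corner cases come from the article; all other cells are assumptions.
PRIORITY_MATRIX = {
    "company": {"cannot work": "very high", "restricted": "high", "can work": "medium"},
    "location": {"cannot work": "very high", "restricted": "high", "can work": "medium"},
    "department": {"cannot work": "high", "restricted": "medium", "can work": "low"},
    "single user": {"cannot work": "medium", "restricted": "low", "can work": "low"},
}


def incident_priority(impact: str, urgency: str) -> str:
    """Look up the priority for a given impact and urgency level."""
    try:
        return PRIORITY_MATRIX[impact][urgency]
    except KeyError:
        raise ValueError(
            f"unknown impact or urgency: {impact!r}, {urgency!r}"
        ) from None


print(incident_priority("company", "cannot work"))
print(incident_priority("single user", "can work"))
```

Writing the matrix out cell by cell, rather than computing a score, keeps every priority decision visible and easy to review with the team.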

The use of such a matrix offers the following advantages:

  • Clarity and transparency: A clear definition of priorities helps everyone involved to better understand the incident management process and work more efficiently.
  • Better resource utilisation: Resources are allocated appropriately to ensure that the most important incidents are dealt with first.
  • Reduction of confusion: Misunderstandings and errors in prioritising incidents are avoided.

The following resources should be available:

  • up-to-date knowledge bases,
  • well-trained employees,
  • appropriate tools and technologies to support the incident management process.

Regular training and continuous updating of the knowledge database ensure that all team members are always up to date.

External resources such as ITIL training courses provide in-depth knowledge and help to improve your team's skills. There are also specialised software solutions that can optimise incident prioritisation and handling. Tools such as EcholoN offer advanced incident tracking, resource management and escalation support.