Key Performance Indicators (KPIs) for Problem Management

Customer Nick asks for ideas on data he can use as a KPI for Problem Management.

That’s a really good question. Before answering it, it’s worth noting that the definition of a Problem is slightly different from the goal of Problem Management, which is (taken from the ITIL Service Support book):

“The goal of Problem Management is to minimise the adverse impact of Incidents and Problems on the business that are caused by errors within the IT Infrastructure…”

In other words, if you ask ‘why do we bother with Problem Management’ the answer is ‘to try to stop Incidents from happening (thereby avoiding their cost and inconvenience)’. Therefore, any KPI should be targeted toward measuring the effectiveness of the Problem Management process – just as Incident Management KPIs have an element looking at timeliness of resolution.

But hold on a minute, it looks like we’ve just described something very difficult. Our ideal KPI would show how many Incidents our Problem Management process had prevented, but if you can get a report from your Helpdesk or Service Desk software telling you how many Incidents have not been logged, then all I can say is wow.

I have seen instances where Problem Managers have shown a declining number of Incidents over, say, a three-month period, and concluded that this is a direct result of Problem Management. All I can say is maybe. Incidents can decline for all sorts of reasons, such as:

  • Investment in the infrastructure, or technological changes unconnected with Problem Management
  • Less business activity (like after a Christmas rush period)
  • Less business or technological change
  • Fewer new employees joining

…and so on. The converse of the examples above can lead to an increase in Incidents, even if your Problem process is bang on the money. Personally, I think disentangling all these reasons is so hard and unscientific that it isn’t a good avenue for anyone to follow.
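To make the point concrete, here is a rough sketch in Python. The figures are invented purely for illustration, and the field names (`incidents`, `transactions`) are hypothetical rather than taken from any real Service Desk tool. It shows how a falling Incident count after a Christmas rush can hide a rising Incident rate once business activity is taken into account.

```python
# Hypothetical monthly figures, invented purely for illustration.
# "transactions" stands in for any measure of business activity.
months = [
    {"month": "Dec", "incidents": 500, "transactions": 100_000},
    {"month": "Jan", "incidents": 400, "transactions": 70_000},
]


def pct_change(before, after):
    """Percentage change from 'before' to 'after'."""
    return (after - before) / before * 100


def incidents_per_thousand(row):
    """Incidents per 1,000 transactions for one month."""
    return row["incidents"] / row["transactions"] * 1000


before, after = months

# Raw Incident count: looks like an improvement.
raw_change = pct_change(before["incidents"], after["incidents"])

# Incident rate relative to business activity: tells a different story.
rate_change = pct_change(
    incidents_per_thousand(before), incidents_per_thousand(after)
)

print(f"Raw Incident count change: {raw_change:+.1f}%")
print(f"Incidents per 1,000 transactions change: {rate_change:+.1f}%")
```

With these made-up numbers the raw count drops by 20% while the rate per 1,000 transactions goes up by roughly 14%. Note that this only adjusts for one factor, business activity. The other causes in the list above are still tangled together, which is exactly why the raw trend is such weak evidence.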

Instead, I guess we’ve got to use some common sense. Common sense tells us that if we have an effective process in place then Incidents and downtime costs will fall. So to me it makes sense to focus on the question ‘is the Problem Management process working?’.

Having described the challenge at length, Friday’s post will describe some actual Problem Management KPIs you can use.