Incident Resolution and Quality

Emailer and sometime commentator Jim asks about quality and his service teams. Specifically, he has second and third line support teams that both have what Jim calls an alarming tendency to mark Incidents as ‘resolved’ when they are not resolved at all. Jim asks if I have any practical suggestions, other than just shouting at people.

Firstly, remember I’ve blogged quite a bit recently about Incident Management.

Coming to Jim’s question, I do have some things that can be considered in circumstances like these. The points for consideration are in no particular order.

  1. Ask if you have created or added to the problem yourself by an inappropriate use of metrics. By this I mean you’ve told your team members that you are looking very closely at statistics for who is resolving Incidents, or that you’ve told the team you are looking at timeliness of resolution on an Agent-by-Agent basis (or worse, both). Now when I say ‘you’ve told’ I don’t necessarily mean you stood on a chair and said that's what you were going to do - remember that people gossip and talk informally amongst themselves. So, staff can get this impression by you simply mentioning the statistics to individuals in a negative way such as ‘why have you only resolved 10 tickets this week?’.

Metrics such as Agent performance need to be used with a little bit of caution, as they often don’t tell the whole story. A colleague of mine here at Serio tells a tale from when he worked as a programmer on a large team fixing bugs in an insurance firm. Bugs were logged, assigned to programmers, and then fixed. A new development team manger was recruited and after two weeks the new manager issued a memo to all development and testing staff complaining about ‘poor numbers’ and proceeded to name an engineer. The manager’s mistake was this: the ‘poor numbers’ guy was the brightest and best in the group, and handled some of the toughest jobs that came into the group – therefore his ‘fix rate’ was much lower.

Does this all mean that such statistics should be avoided? Absolutely not. It simply means that they should be used with caution, and you need to be aware of how your staff might regard the use of such statistics.

One positive step is to make sure that you have statistics for the numbers of Incident re-opened, focusing on who the original resolving Agent was, and to use this in conjunction with other Agent performance stats.

  1. Tell your teams that you perceive a problem. Try to bring them onside, and appreciate the need for quality rather than premature fixes. Try to understand how your team members see their role and what pressures they feel.

  2. Consider the roles of Team Leaders. Ask them to review some or all of the Incidents being resolved by their team.

  3. Introduce a 2-stage completion process if you don’t have one. By this I mean that when service teams resolve Incidents, they put the Incident to ‘Pending Complete’ and re-assign back to the Helpdesk or Service Desk. What then happens is we check with the customer proactively to make sure that the fault is resolved.

  4. Consider the possibilities of skills gaps within your teams, particularly the second and third-line support. Examine Incidents that have been re-opened for clues as to why this problem is happening.

  5. Make sure that the Helpdesk or Service Desk is actually re-opening Incidents, rather than logging new ones. I’m saying this but I know it’s hard to do 100% right. I’ve blogged before about having a call handling script – amend your script to ask if Incident have been reported previously, and give staff guidance on when it is right to re-open.

By the way, if you’ve read this post and are thinking ‘this does not affect me’ I have to ask how do you know, and are you sure?