Serio Blog

Wednesday, 28 Feb 2007

This is a follow-up post to improving your chances of success with an ITSM project.

One thing I want to clarify is the topic I’m expanding upon: improving the chances of success. That doesn’t mean I’m going to try to address how you should go about an ITSM project (which will depend on your current position, skills, budget, and organisation).

I’m interested in improving the chances of success, in the same way that you might say ‘improving the chances of cycling home safely in February’ means ‘have good lights, don’t drink alcohol beforehand, wear reflective clothing, avoid busy and narrow roads, or roads with vehicles travelling at high speeds’. None of these things guarantees the outcome I want when cycling home, but they do improve my chances.

On Monday I blogged about objectives, and how important they are. It occurs to me that another use for a good set of objectives is at the end of the project, as a way to judge whether you’ve achieved the things you needed to.

Today I’m going to blog about what I consider to be akin to a mortal sin – complexity.

Complexity seems to be a very common personal trait amongst managers. However, without question the best and brightest people I’ve worked with have been the ones with the insight to make things simple.

An example is when someone says ‘I have come up with an Incident Management process’ (I’ve blogged about Incident Management before).

Often, with a flourish, a flowchart is produced. I’ve nothing against flowcharts (Serio has some pretty nice ones for example) but Helpdesk and Service Desk managers do seem to be intoxicated at times by them. These flowcharts twist and turn, doubling back on themselves, looping and branching, applying different (and obscure) status values before coming to an end. In my experience these are almost always unnecessary and destined to be ignored by Incident Management staff who are getting on with the business of resolving Incidents and dealing with customers.

Instead focus on customers, restoring service, how you organise teams, how you develop a sense of quality and ownership, how work is assigned between people, how you develop your team leaders, who owns the tickets, and what responsibilities people have. Be outward looking.

More than anything, keep things simple. If you have a team at the smaller end of the scale, don’t regard this as a problem – instead view it as an advantage by making your procedures and processes simpler. Adopt an attitude that your staff are trained correctly and know how to do their jobs. If problems become evident early on, respond to them, but don’t try to anticipate problems in advance.

Monday, 26 Feb 2007

An emailer I’ll refer to as Peter asks what he can do to ‘improve the chances of success’ with his ITSM project. He describes his help desk as being reactive ('being honest, we just fix things when they break') and describes his users as unimpressed with the quality of IT Service. He adds that ‘I have no additional budget at all, and face losing one member of staff through cost cuts over the next 6 months. I have a simple helpdesk tool, and cannot buy anything else. But I need to start doing more and providing a more professional support service or I won’t be here next year’. Peter has 8 staff, and with staff costs his IT spend last year was £280,000 – this year it will be less, by an unspecified amount.

First of all Peter, there are a lot of free resources like this blog that you can use – a spot of googling will locate them. I’d start by saying have a look at the IT Service Management category here, and at some of the white papers you’ll find on our home page.

Peter specifically asks for things that will improve his chances of success, and appears to be familiar with ITIL – indeed he mentions it twice in his email.

Firstly, I’d say you should be clear in what you want to achieve, and be wary of seeing ITIL (or any other service management framework) as an end in its own right – you should not be aiming to be ‘ITIL compliant’ because that of itself may not deliver what your company needs. Instead, look at things such as ITIL as a way to help you deliver improvements to service.

My editorial guidelines for blog articles require me to be practical, and so I’ll try to expand the paragraph above and be both practical and specific. When I say ‘be clear what you want to achieve’ I mean set real, tangible, specific objectives for yourself, and avoid generalities like ‘improve Incident handling’. Specifics might include things like:

  • Reduce the number of Incidents measured month-on-month by 10%
  • Improve our first-time fix rate
  • Reduce the downtime experienced on our key systems
  • Reduce the amount of time taken to resolve Incidents
  • Find out why our users/customers feel the way they do about us

..and so on. The objectives you set yourself will probably be linked to the weaknesses of your current IT support operation as you see it, but be specific and be ‘outward looking’. One of the traits which successful managers involved in ITSM seem to have is that they are outward looking – towards the organisation, business, and customers.

Properly set out, your objectives will help to keep you focussed and on-track.
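An objective like the first one in the list above only helps if you can check it each month. Here is a minimal sketch in Python of how that check might look, assuming you can export a monthly Incident count from your helpdesk tool. The function names and the figures are invented for illustration, and the code assumes no month has a count of zero.

```python
def month_on_month_change(counts):
    """Return the percentage change between each pair of consecutive months.

    counts: list of (month_label, incident_count) tuples in date order.
    A negative value means fewer Incidents than the month before.
    Assumes every count is greater than zero.
    """
    changes = []
    for (_, previous), (label, current) in zip(counts, counts[1:]):
        change = (current - previous) / previous * 100
        changes.append((label, round(change, 1)))
    return changes


def objective_met(counts, target_reduction=10.0):
    """For each month, report whether Incidents fell by at least the target percentage."""
    return [(label, change <= -target_reduction)
            for label, change in month_on_month_change(counts)]


# Hypothetical figures, for illustration only.
monthly_counts = [("Jan", 400), ("Feb", 352), ("Mar", 334)]
print(month_on_month_change(monthly_counts))
print(objective_met(monthly_counts))
```

With these made-up numbers February meets the 10% target and March does not, which is exactly the sort of plain yes/no answer a specific objective should give you.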

I’ll continue this interesting topic in later posts.

Friday, 23 Feb 2007

Whilst the above title might sound like an advert for expensive chocolate, I’d say that Description Templates really are good things – and an often overlooked feature.

First of all, this is what Description Templates do: they are like a ‘speed click’ for the description of an Incident, Problem or Change (and more, as we’ll see below) – to save you having to type it in manually. They are also useful as prompts or mini-scripts for those logging Incidents from customers.

For instance, you might want to manually type this Incident description:

The customer has forgotten their password on the domain server, and ended up getting themselves locked out. I validated the caller’s ID and re-set their password. They logged in whilst still on the phone.

or with a few clicks you could use this ‘standard’ text whilst logging the Incident.

Another use is as a prompt to help you capture the right information at the time of logging a ticket, as in this example:

Application Version: XXXX

Error Message: XXXX

Operating System: XXXX

Screen reference or ID (If known):

and so on – so that if you escalate the Incident to second-line support, they have everything they need in front of them.
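If you are drafting prompt-style templates before sending them to your Administrator, it can help to see what a filled-in version looks like. The following is a rough Python sketch using the standard library's string.Template. It is not how Serio stores or applies Description Templates; the field names and the helper function are my own placeholders.

```python
from string import Template

# Hypothetical prompt-style template, mirroring the example fields above.
PROMPT_TEMPLATE = Template(
    "Application Version: $app_version\n"
    "Error Message: $error_message\n"
    "Operating System: $operating_system\n"
    "Screen reference or ID (If known): $screen_ref"
)

# Placeholder field names (assumed for this sketch only).
FIELD_NAMES = ("app_version", "error_message", "operating_system", "screen_ref")


def fill_description(template, **fields):
    """Fill in a description template, leaving any unknown field as XXXX."""
    values = {name: "XXXX" for name in FIELD_NAMES}
    values.update(fields)
    return template.substitute(values)


print(fill_description(PROMPT_TEMPLATE,
                       app_version="4.2",
                       error_message="Access denied"))
```

Anything the person logging the ticket has not captured stays as XXXX, so second-line support can see at a glance what is still missing.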

Using a Description Template is easy. When logging the ticket, look at the field where you enter the Incident Description and you’ll see the standard Serio look-up button marked with an ellipsis (…). You simply click that and choose the 'canned' description you want.

If that lookup window is empty when you click it, then your Serio Administrator needs to set one up for you – so ask them nicely. However, don’t just say ‘Can we have some of them Description Template things please?’. Instead, write what you want (the whole text) in an email, and send it to them.

Description Templates can also be used in Actions. In just the same way that they can speed up Incident logging, they can also speed up taking Actions. It’s done this way: your Serio Administrator can link a Description Template to an Action, so that when you take the Action the comments field is pre-populated for you.

Again, if you think this is a good idea send a full and complete request to your Administrator – it only takes a few minutes to set up.

Tuesday, 20 Feb 2007

This is a follow-up to my earlier Interviewing post – I said at the end of that post that I would suggest some topics that an interviewer could use. If you are going to use any of these, please keep in mind all my do’s and don’ts from the earlier post.

If you imagine that you are interviewing for an Incident Manager to take over a team of (say) 6, in an organisation where IT Service Management is developing and not yet totally mature - I'm hoping these questions/topics will help.

Also note that how you describe the job is important – for instance, you might have said ‘experience of IT Service Management techniques, and their practical application’ for an organisation of your size. What you must do is signpost the fact that ITSM is part of what you are looking for - it's not nice to surprise people at interview.

Topic: As I’m your supervisor, what might you include on a management report to me?

I might start by asking this, and discuss some of the Key Performance Indicators (KPIs) that they might include (if you are asked ‘what do you want?’ reply that you are looking for suggestions). All I would be looking for here are some ideas that show the candidate has thought about this, and prepared (or read) such a report previously. If the only suggestion offered was ‘SLA resolutions on time’ I might be a little disappointed – there are many other measures that tell you a lot more.

Topic: What would you expect to have responsibility for as Incident Manager?

This is potentially a little controversial, so explain you are simply looking for sensible suggestions and have nothing fixed in your mind. What you are looking for here is for the candidate to show they have a good grasp of such a role, and to suggest things such as the Incident Management process, production of reports, day-to-day Incident team supervision, taking a lead in Major Incident handling and so on.

Topic: What strategies might you adopt for specialist Incident teams?

There are lots of ways to phrase this, such as ‘we have teams organised into technical specialities, and some of them are hopeless’ or ‘Our second line support team has a record currently of poor customer service’. Regardless of the way you phrase this, what you are getting at is how the Incident Manager will work to ensure an acceptable level of quality and service from teams that he or she may not control directly (of course, your candidate might have suggested a supervisory role over such teams in an earlier response).

Topic: You find out that Incidents are marked as resolved when they are not. What do you do?

Again, you should be looking for practical suggestions (see my earlier post on Incident Resolution and Quality) that show a methodical and pragmatic candidate.

There are many, many more you can use. Note that all of the above are open-ended with no absolutely right or wrong answer – they simply allow candidates to express themselves, and to show a little bit of creativity and interest in the subject.

One of the things that I’ve found somewhat off-putting in the past is direct quotation from the written ITIL sources as if they were a holy text – it’s usually the less confident candidate who does this. What I have found works best is to find someone with ideas and an interest in what they do.

Monday, 19 Feb 2007

I wrote previously about Service Level Management reports. I’m going to pick up this topic again for this post, looking at some of the more detailed or complex reports on this subject.

Once more, these reports will be located within SerioReports – see the earlier post for my comments on this. As with the previous reports, the reports below are grouped under ‘SLA Analysis’.

I will start with SLA10. This report focuses on the timeliness of resolution of Incidents, on a calendar month by calendar month basis. It is a histogram-style report, and shows the percentage of resolutions on time (by Priority) for each month. It allows you to add your target (for instance, 90%) to the report, so that the actual outcomes can be compared to the target. As the report shows month-on-month percentages it is useful in showing trends (improving or worsening situation over time).

People are sometimes puzzled as to why more than one target line can be added to the report. This is because some customers define variable targets for SLA resolutions, as in this example:

Priority Critical – Target 99% on time

Priority High – Target 95% on time

Priority Medium – Target 90% on time

and so on.

Other Serio customers simply have a target that is the same across all Priorities. Either way it’s fine – the report works for both groups.

As a final flourish, you can add averages for each month across all Priorities.
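If you want to sanity-check figures like these outside SerioReports, the arithmetic behind a 'percentage on time by month and Priority' report is simple enough to sketch. The Python below is an illustration only, not how SLA10 is implemented. It assumes you have an export with one row per resolved Incident, and the data layout, function names and sample rows are all invented.

```python
from collections import defaultdict

# Hypothetical targets, matching the variable-target example above.
TARGETS = {"Critical": 99.0, "High": 95.0, "Medium": 90.0}


def percent_on_time(incidents):
    """Group resolved Incidents by (month, priority) and return % resolved on time.

    incidents: iterable of (month, priority, resolved_on_time) tuples, where
    month is a string such as "2007-01" and resolved_on_time is a bool.
    """
    totals = defaultdict(int)
    on_time = defaultdict(int)
    for month, priority, ok in incidents:
        totals[(month, priority)] += 1
        if ok:
            on_time[(month, priority)] += 1
    return {key: round(100.0 * on_time[key] / totals[key], 1) for key in totals}


def compare_to_targets(results, targets=TARGETS):
    """Return (month, priority, percent, target, met) rows, sorted by month then priority."""
    rows = []
    for (month, priority), percent in sorted(results.items()):
        target = targets[priority]
        rows.append((month, priority, percent, target, percent >= target))
    return rows


# Invented sample data, for illustration only.
sample = [
    ("2007-01", "High", True), ("2007-01", "High", True),
    ("2007-01", "High", False), ("2007-01", "High", True),
    ("2007-02", "High", True), ("2007-02", "High", True),
    ("2007-01", "Medium", True), ("2007-01", "Medium", True),
]
for row in compare_to_targets(percent_on_time(sample)):
    print(row)
```

Keeping the targets in one small table per Priority is what lets the same calculation serve both kinds of customer: those with variable targets, and those with a single target repeated for every Priority.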

Another more complex report is SLA7. This report also reports on timeliness of resolution, but includes information, banded by Priority, about each Escalation level reached. This is useful because it is not good practice to focus only on what was on time and what was not. Service level managers should take a further interest, as in this example:

For the Incidents that were not on time, just how late were they?

For instance, it is not as bad to have Incidents being resolved 30 minutes after their target as it is to have them resolved 30 days after their target. One way to find this out, assuming you’ve set up your Escalation points sensibly, is to use SLA7. Alternatively, you can explore this ‘lateness’ concept by running report SLA12.
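To make the 'how late?' question concrete, here is a small Python sketch that sorts a resolution into a lateness band. It is not the logic used by SLA7 or SLA12; the bands, labels and function name are assumptions I have made for the example, and you would pick bands that match your own Escalation points.

```python
from datetime import datetime, timedelta

# Hypothetical lateness bands as (label, upper limit), smallest first.
BANDS = [
    ("up to 1 hour late", timedelta(hours=1)),
    ("up to 1 day late", timedelta(days=1)),
    ("up to 1 week late", timedelta(weeks=1)),
]


def lateness_band(target, resolved):
    """Return a label describing how late a resolution was against its target time."""
    overdue = resolved - target
    if overdue <= timedelta(0):
        return "on time"
    for label, limit in BANDS:
        if overdue <= limit:
            return label
    return "more than 1 week late"


# Invented example: one target time, two very different outcomes.
target = datetime(2007, 2, 1, 12, 0)
print(lateness_band(target, target + timedelta(minutes=30)))
print(lateness_band(target, target + timedelta(days=30)))
```

Counting Incidents per band, rather than just 'late' versus 'on time', is what separates the 30-minute miss from the 30-day one.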

Friday, 16 Feb 2007

In an earlier post I touched on the subject of interviewing, in response to a commentator's question. This turned out to be a really popular article, with a significant number of unique visitors and email correspondence generated. Thank you all for your comments.

I'm going to revisit this subject today. Our internet logs tell us that quite a lot of people are searching for information such as 'interview questions' or 'how to conduct a service desk interview' and I want to write an article that might help. However before that I want to talk about spambots and recruitment firms.

One thing the earlier post did was generate, after a period of about a week, a very large number of calls from recruitment firms – all asking for either 'Pete' or 'George', all adamant we were looking for a new Service Desk Manager, and all very, very persistent. At first we didn't understand this, but then someone explained it to us. There are a significant number of what I will call spambots that are looking for vacancies on websites, and clearly they've been crawling ours (bots are programs that read all of the pages in a website). These bots are clearly trying to read the text, and then sell the 'opportunity' to recruitment firms. However in this case the bots seem to have misread the post.

OK, for the benefit of bots: please do not call us, this post does not mean we have a vacancy :-)

Now on to the subject of interviewing. It's always surprised me that most people who are interviewing have had no training, and very little guidance. Presumably you are meant to learn the technique from the times you've served as the interviewee. What I'd say is that if it's your first interview, don't be afraid to role-play with a colleague (it's best done with a peer). Get your colleague to sit with you playing the role of a candidate, and afterwards discuss with them how the interview went. This will help you get into the swing of things, and act as a nice rehearsal.

For the rest of this post, I'll adopt a do/don't format.

Do: Try to bring the interviewee into the interview as early as possible. In other words, don't make them sit there and listen to the history of the company starting from when your founder was born. State a little bit about the job and try to get them talking. Assume they've done some research.

Do: Use the CV as a talking point. Most people will have projects listed on their CV. Try to get them to talk about their projects, focusing on what they did and their role in the project. Ask them about what worked, and what did not. Ask them what they learned.

Do: Give people a little time to compose themselves. Once shown to the interview room, I ask if they'd like tea or coffee. Regardless of the answer, I leave for just a few minutes (note: not 5 or 10 minutes, just 2).

Do: Keep your questions open-ended and allow the interviewee to come in as often as possible.

Don't: Start with areas of discussion or topics that are likely to be controversial. Leave these towards the end when hopefully you've built up a rapport. Personally I avoid arguments – if someone states a fact which I am certain is incorrect, I just say 'are you sure about that?'.

Don't: Bombard them with fact-based questions like they are sitting an exam. Just drop these into the interview. Don't turn the interview into an ITIL Q & A.

Don't: Paint an overly rosy picture. True story: a colleague of mine here at Serio (years ago) got a job as an IT manager. He had the foresight to ask during the interview 'why is the current IT manager leaving' and was told 'he's started his own business' which seems fair enough. After starting my colleague found out the truth: the company was 2 years old and had burned the 3 previous IT managers. The first suffered a nervous breakdown in the office after 9 months (crying, shaking), the second was fired after 4 months, and the third had just got up from his desk one day and walked out never to return. When he found out, my colleague was, of course, deeply unimpressed.

Don't: Assume the qualifications and references are all genuine. I've been surprised by the extent to which false information is placed on CVs – check things out carefully.

Do: Be careful about 'Why?' and 'Why did you do that?' as it will unsettle some people – making them feel like 'they've put their foot in it'. If you can, preface such questions with 'that's interesting' to make people feel more at ease.

Do: Finish with 'Are there things you'd like to talk about we have not mentioned?' and/or 'Do you have any questions for me?'.

Don't: Be secretive. Someone has taken their time to see you, so tell them about the recruitment process. Don't assume the agency (if they have come through an agency) has told them anything.

Do: Ask relevant questions for the job (naturally), following the advice given above (although question is the wrong word, it's more like 'topics for discussion').

In posts next week I'll suggest some topics for you to use for a few typical Helpdesk or Service Desk roles.

Wednesday, 14 Feb 2007

This post will be about reports available in SerioReports. My last topic focused on Incident KPIs; this post will focus on reports that might be of interest to a service level manager.

At this point, I’ll try to define what we aim to achieve with service level management:

Maintain and improve the quality of IT Service Delivery by agreeing what our service levels should be, and then constantly monitoring and reporting on the service levels we are achieving. Where appropriate make recommendations for improvements in service delivery.

There is more to this than simply response and resolve times, important as they are, but I expect we will pick up service level management in greater detail in future posts.

Locating the Reports

First off, you will need SerioReports – this is where most of the package’s reports are. SerioReports is a licensed product (so you need a license!) that you have to install on a computer somewhere. If you can’t find it, ask your Serio Administrator to install it for you.

A large number of the service level management (or SLA) reports are grouped under SLA Analysis, where you’ll find about 25. What I will do now is to pick some of the more interesting ones and talk about them.

Firstly I’ll start with overview reports.

SLA5 – This is a very simple report. Like most of these reports, it takes a start date, end date and an SLA, and then shows you (by priority) your Incident resolution within agreed SLA times. This report is sometimes printed to PDF and circulated to ‘casually interested’ parties, or pinned to notice boards.

SLA6 – This is another simple report. It reports on callbacks or responsiveness, and it is a fairly common industry measure of performance. It is the amount of time taken for the Service Desk or Helpdesk to respond to an Incident being reported – it is not an automated response. In Serio when logging a ticket, there is a checkbox that says ‘Customer needs a callback’ – this report uses Incidents you’ve flagged in this way.

Moving onto more detailed reports there is SLA4. This report uses a matrix to report on Incident resolution performance by Company. Service level managers would use it, after checking overall performance, to examine the service levels obtained by each major Company or department that they support, and so identify groups that are receiving poor performance.

Monday, 12 Feb 2007

This post is to tie together all of my previous posts about Incident Management that have appeared over the past month or so.

One of the things I really like about the idea of a blog is its somewhat informal nature and the expectation its content may be a little eclectic at times. One of its disadvantages is that it can be difficult to follow a ‘thread’ of articles over time - hence wrap-up posts like this are useful in some cases.

I started off in January with this topic: Introducing Incident Management, which offered a definition and talked about the concepts of Ownership and Assignment.

Next I posted about The Incident Life Cycle – which pretty much did what it said in the title. It defined and explained the major steps in the Life Cycle, and talked about the ‘workflow position’ with a few examples of how to find and use that in the Serio tool.

I then went on to blog about Escalation in Incident Management. This post talked about Escalation – a term that means different things to different people, and offered some examples.

I followed this by talking about Success Factors in Incident Management and listed some things to help you develop a better Incident Management process – things like culture, and giving yourself some attainable objectives in the opening month.

Next up: Major Incidents. Really this is worth quite a few articles, but I included it here because it seems to be ‘mysterious’ to some people. I defined a Major Incident, and suggested a (by no means comprehensive) list of things that might be part of a Major Incident process – to give those who had asked about such a thing an idea of where to start.

This was followed by an article on Key Performance Indicators for Incident Management. The idea here was to help with the reporting and metrics side of things – again for those getting started and having to do this for the first time.

My colleague and blog Robin to my Batman :-) posted in this post and also here about some reports in Serio that could be used in Incident reporting. These articles were detailed and told you where to locate the reports – so as to remove any confusion at all. All you need to do is draw conclusions and make recommendations for improvement based on the data you see.

Finally, we all need to develop good habits – so I posted some.

A PS blog post was added after a commentator’s question about quality. Phew.

Friday, 09 Feb 2007

Emailer Steve responds to this Problem Management white paper and poses this question: what can you do if no Problems are actually raised by the Incident resolution teams? Steve specifically mentions a period of over 4 months with no Problems at all being raised. Steve’s teams are handling 2500 or so Incidents per week; he has 30 or so staff on the Service Desk and a further 70 working in specialist teams like those I’ve referred to in this post on escalation.

Firstly, for new readers you can find a definition of a Problem here. It’s one of the outputs from Incident Management into Problem Management.

My editorial brief for this blog is to provide practical suggestions and to avoid unnecessary jargon – so that’s what I’ll try to do. What follows is in no particular order; it’s just some ideas to check off.

  1. Shared vision and understanding. If Steve has a vision of how things should work, and the use of Problem Management is a part of that, that vision needs to be shared between the managers and the IT service delivery staff. This can be quite hard, as sometimes the focus of more technically minded staff is not on service management. One of the things I see time and again is where the management view is one thing, and IT staff another. Managers do sometimes have a tendency to assume that staff know what is expected, when the staff either don’t, take a contrary view, or simply don't care.

Having stated the problem, I’ll try to suggest some solutions. One approach is to have regular IT service meetings where issues like this can be raised, managers can be persuasive, and staff can air their views. In my experience a lot of organisations simply don’t have meetings like this. It’s beyond the scope of this post to write about how such meetings should be conducted, but avoid having groups that are too large (some staffers will feel intimidated from contributing), make sure you have a chairman or woman who knows what they are doing, have an agenda, and produce action points afterwards.

  2. Make sure your procedures are written down in a concise, usable form. This comes back to my point about managers sometimes misleading themselves about the understanding that others have. However, in doing this you want a concise document that is useful and is not like a legal document. At Serio we refer to this as an Operation Manual and have a template/example for customers to use.

  3. Ensure that your staff understand what a Problem is, as the actual meaning can be quite elusive to some people. The best way to illustrate this is to pick one or two Incidents, and explain in your meetings why these are Problems (be careful not to be seen to be critical in the early stages).

  4. You should have a Problem Manager, and everyone should know who that person is. This person owns all the Problems, and the Problem Management process. The Problem Manager should have a team. In all probability, this will not be a full time team, but will be drawn from the existing service teams so that staff have dual roles (we work on both Incidents and Problems). Pay particular attention to the composition of the Problem Management Team – choose from a good cross section of disciplines and Incident Management groups. These people can act as advocates of Problem Management for you and as ‘spotters’ in the Incident teams.

  5. Make sure it’s really, really easy for staff to flag Problems (and I mean REALLY easy). For instance, you could define a Cause Category of ‘Unknown’ which is applied at resolution time and this could be taken by the Problem Manager as someone saying ‘this is a potential Problem record’. This would be my advice for Serio users. Make sure that whatever Cause code you use, its meaning ('hey this is a potential Problem!') is understood widely.

  6. Make sure your ITSM tool is up to scratch. For instance, raising a Problem ticket from an Incident should be easy. Linking multiple Incidents to a Problem should also be easy.

  7. Enhance the role of the Service Desk in Problem Management by getting them involved in reviewing Incidents. Make a specific responsibility one of scanning for Problems.

  8. Ensure that the Incident Manager and Problem Manager have a good working relationship on all levels. If they sit next to or within earshot of one another you might find that this helps the Problem Manager be aware of issues coming through in the Incident Management process that they have an interest in.

  9. Whatever your procedures are, make sure the Problem Management process does not slow in any way the ability of Agents to close Incident tickets – they might want to do this quickly for all sorts of reasons (see yesterday’s post). They should be able to flag Incidents quickly for consideration by the Problem team, and it should not slow their ability to close the ticket.

Wednesday, 07 Feb 2007

Emailer and sometime commentator Jim asks about quality and his service teams. Specifically, he has second and third line support teams that both have what Jim calls an alarming tendency to mark Incidents as ‘resolved’ when they are not resolved at all. Jim asks if I have any practical suggestions, other than just shouting at people.

Firstly, remember I’ve blogged quite a bit recently about Incident Management.

Coming to Jim’s question, I do have some things that can be considered in circumstances like these. The points for consideration are in no particular order.

  1. Ask if you have created or added to the problem yourself by an inappropriate use of metrics. By this I mean you’ve told your team members that you are looking very closely at statistics for who is resolving Incidents, or that you’ve told the team you are looking at timeliness of resolution on an Agent-by-Agent basis (or worse, both). Now when I say ‘you’ve told’ I don’t necessarily mean you stood on a chair and said that's what you were going to do - remember that people gossip and talk informally amongst themselves. So, staff can get this impression by you simply mentioning the statistics to individuals in a negative way such as ‘why have you only resolved 10 tickets this week?’.

Metrics such as Agent performance need to be used with a little bit of caution, as they often don’t tell the whole story. A colleague of mine here at Serio tells a tale from when he worked as a programmer on a large team fixing bugs in an insurance firm. Bugs were logged, assigned to programmers, and then fixed. A new development team manager was recruited and after two weeks the new manager issued a memo to all development and testing staff complaining about ‘poor numbers’ and proceeded to name an engineer. The manager’s mistake was this: the ‘poor numbers’ guy was the brightest and best in the group, and handled some of the toughest jobs that came into the group – therefore his ‘fix rate’ was much lower.

Does this all mean that such statistics should be avoided? Absolutely not. It simply means that they should be used with caution, and you need to be aware of how your staff might regard the use of such statistics.

One positive step is to make sure that you have statistics for the number of Incidents re-opened, focusing on who the original resolving Agent was, and to use this in conjunction with other Agent performance stats.

  2. Tell your teams that you perceive a problem. Try to bring them onside, so that they appreciate the need for quality rather than premature fixes. Try to understand how your team members see their role and what pressures they feel.

  3. Consider the roles of Team Leaders. Ask them to review some or all of the Incidents being resolved by their team.

  4. Introduce a 2-stage completion process if you don’t have one. By this I mean that when service teams resolve Incidents, they put the Incident to ‘Pending Complete’ and re-assign back to the Helpdesk or Service Desk. What then happens is we check with the customer proactively to make sure that the fault is resolved.

  5. Consider the possibilities of skills gaps within your teams, particularly the second and third-line support. Examine Incidents that have been re-opened for clues as to why this problem is happening.

  6. Make sure that the Helpdesk or Service Desk is actually re-opening Incidents, rather than logging new ones. I’m saying this but I know it’s hard to do 100% right. I’ve blogged before about having a call handling script – amend your script to ask whether the Incident has been reported previously, and give staff guidance on when it is right to re-open.
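The re-opened Incident statistic I suggested under the first point can be worked out with very little effort if you can export who resolved each Incident and whether it was later re-opened. This Python sketch shows the idea; the data layout, function name and Agent names are invented, and it is not a description of any Serio report.

```python
from collections import Counter


def reopen_rates(resolutions):
    """Return {agent: (resolved, reopened, percent_reopened)}.

    resolutions: iterable of (resolving_agent, was_reopened) pairs, one per
    Incident resolution, where was_reopened is True if the Incident later
    had to be re-opened.
    """
    resolved = Counter()
    reopened = Counter()
    for agent, was_reopened in resolutions:
        resolved[agent] += 1
        if was_reopened:
            reopened[agent] += 1
    return {
        agent: (count, reopened[agent], round(100.0 * reopened[agent] / count, 1))
        for agent, count in resolved.items()
    }


# Invented sample data, for illustration only.
sample = [("anna", False), ("anna", True), ("anna", False), ("anna", False),
          ("ben", True), ("ben", True)]
print(reopen_rates(sample))
```

As with any Agent-level number, read it alongside the other stats and the kind of work each person handles, rather than as a league table.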

By the way, if you’ve read this post and are thinking ‘this does not affect me’ I have to ask how do you know, and are you sure?