Serio Blog

Monday, 05 Feb 2007

One of the tasks that many Serio Administrators can find a little daunting is preparing categorisations for Incidents, Problems and Changes. What I’m referring to is populating the data fields that help us to manage tickets.

I’m going to list some tips to help those setting about this for the first time, or those reviewing their categories.

  • Remember that someone will have to use this information during Issue logging, so make the number of Problem Area Categories that you have available during Issue logging less than 12 or so if you can. Then try to keep the number of Problem Areas within each Category less than 12 as well. Doing this gives a reasonable list for helpdesk and service desk staff to work with.

  • It’s OK (and indeed desirable) to have a small number of Issue Types. Remember these just record the type of ticket – for Incidents, you might have Fault, Query, Work Request and so on.

  • Try to create categories and data that are unambiguous. In an ideal situation, there will be an obvious best choice for most Issues being logged. Overlap in Problem Areas can cause the same type of fault to be classified in different ways.

  • The data you are creating will be used in Incident, Problem and Change reporting.

  • If you are starting from scratch, remember to examine any legacy data you have.

  • Also if you are starting from scratch, don’t be afraid to prepare your data away from the tool itself. Sometimes it’s useful simply to write your data down in a familiar tool such as a spreadsheet where your focus will be on the data rather than the software ITSM tool. However, if you do this make sure you are familiar with the structure of the data the tool requires.

  • It’s a process. Therefore after setting-up your data see how it’s working by examining tickets you’ve logged. Don’t be afraid to remove data that is not being used.

  • Remember that Serio allows you to have different classification data for Incidents, Problems and Changes.

  • Finally, have an ‘other/unclassified’ category. This will help for odd types of infrequently occurring Incidents – but make sure that this is not overused.
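If you prepare your data in a spreadsheet first, a few of the tips above can be checked mechanically before anything goes into the tool. What follows is only a rough sketch in Python: the dictionary layout, the function name `review_categories` and the example data are my own assumptions for illustration, not anything Serio provides or requires.

```python
from collections import Counter

MAX_CHOICES = 12  # the "less than 12 or so" guideline from the tips above


def review_categories(tree, max_choices=MAX_CHOICES):
    """Return a list of warnings for a category tree.

    `tree` maps each Problem Area Category to a list of Problem Areas
    (a hypothetical layout, e.g. exported from a spreadsheet).
    """
    warnings = []
    if len(tree) >= max_choices:
        warnings.append(f"{len(tree)} categories; aim for fewer than {max_choices}")
    for category, areas in tree.items():
        if len(areas) >= max_choices:
            warnings.append(
                f"{category}: {len(areas)} problem areas; aim for fewer than {max_choices}"
            )
    # The same Problem Area name under two categories invites ambiguity.
    seen = Counter(
        area for areas in tree.values() for area in {a.lower() for a in areas}
    )
    for area, count in sorted(seen.items()):
        if count > 1:
            warnings.append(f"'{area}' appears under {count} categories")
    # Check for an 'other/unclassified' catch-all category.
    if not any(name.lower().startswith("other") for name in tree):
        warnings.append("no 'other/unclassified' category")
    return warnings


example_tree = {
    "Printer": ["Paper jam", "Toner", "Driver"],
    "Spreadsheet": ["Formula error", "Driver"],
}
category_warnings = review_categories(example_tree)
for line in category_warnings:
    print(line)
```

A check like this will not tell you whether your categories are sensible, only whether they break the simple rules of thumb listed above.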



Friday, 02 Feb 2007

Over the past two weeks or so I have been blogging about Incident Management, and in particular how it is described in the ITIL Service Support book. At a later date I’ll draw these posts together, but for now I’ll explain why I started this thread.

My motive was to try to illustrate that it is quite straightforward, and to try to de-mystify the subject for readers. It was also to point out that, valuable as it is, it’s not a silver bullet, nor is it that far removed from what properly managed IT service Helpdesks and Service Desks do anyway. Hopefully from my posts it is clear that ITIL Incident Management is non-prescriptive – it doesn’t tell you what to do. Instead, it offers a framework you can use for your own organisation and circumstances (therein lies both a strength [flexible enough for organisations of different types and sizes] and a weakness [insufficient guidance], depending on your point of view).

What I want to do is to write about some of the habits I see being adopted by the successful 'top 20%' of Helpdesks and Service Desks I have encountered. They are in no particular order, and the list is not definitive (in fact, I may return to this later). However, you’ll be able to see how you compare, and if you’ve others to add use the comments field. You might also want to have a look at my Success Factors in Incident Management post.

Habit: There is a good team structure, for instance Service Desk, Second Line and so on. Each Team has a Team Leader.

Habit: The overall Incident Management process is written down, with a clear focus on explaining what the responsibilities of different team members are. For example, we’d try to write down some of the responsibilities of our Team Leaders.

Habit: There is an Incident Manager whom everyone can identify. Everyone understands his or her responsibilities because these are written down.

Habit: The Incident Manager produces reports and metrics for the Incident Management team.

Habit: There is a constant drive to develop and maintain a Knowledgebase. There is a nominated Knowledgebase editor to whom suggestions for new articles can be made, and who acts on them quickly – checking the articles for relevance and accuracy.

Habit: The quality of Incident records is seen as important. Staff make an effort to ensure proper classification and recording.

Habit: Strong focus on the central role of the Helpdesk or Service Desk in communication, with an emphasis on practicalities. For example, ensuring that ownership is maintained with the Service Desk even when Incidents are assigned to support teams, and then maintaining an involvement (particularly in terms of customer communication) with those Incidents by the Service Desk staff.

Habit: Careful use of status values to deliver a Workflow Position.

Habit: They make time for regular weekly team meetings for those most directly involved in Incident Management. These meetings follow a standard agenda, but allow flexibility for different issues to be raised. The meetings vary the chairman or woman, but the chair has clear guidelines on how to conduct the meeting.

Thursday, 01 Feb 2007

This is just to round off yesterday’s post about KPIs in Incident Management, and where you can find them in Serio.

Resolutions by the Service Desk or Helpdesk

This refers to the resolutions achieved by the Helpdesk/Service Desk. You’ll find this data available as a column in the First Time Fix report AGT14.

Percentages of Incidents Handled within SLA Target

Most of the SLA-based reports are clustered under ‘SLA Analysis’ – there are around 25 in all. Picking a few at random:

SLA5 – A nice, simple report which shows the percentage of resolutions on time, broken down by Priority.

SLA4 – A more detailed account, broken down by Company, of SLA resolutions on time.

SLA9 – Shows your response performance, again organised by Priority.

SLA10 – This is an interesting but complex report in graph form. It shows your SLA resolutions on time on a month-by-month basis. It also allows you to add your targets (for instance, 90% on time) onto the same graph for direct comparison, and also trend analysis.

Spread of Resolution Time

For this, see the ‘SLA Resolution Time Profile’ reports SLA12 and SLA12a. These show you, in a convenient graph form, how long Incidents are taking to resolve. The spread of resolution times is presented in a useful ‘banded’ form, and you have a lot of control over how thick or thin the bands actually are.
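To show what ‘banded’ means here, this is a small Python sketch that groups resolution times into bands of a chosen width. It is not how the Serio reports are implemented; the function name `band_resolution_times`, the use of hours as the unit and the sample figures are all assumptions made for illustration.

```python
from collections import Counter


def band_resolution_times(hours, band_width=4):
    """Count Incidents per band of `band_width` hours.

    Returns (band_start, band_end, count) tuples in order, where each
    band covers band_start <= t < band_end. Empty bands are included
    so that gaps in the spread stay visible.
    """
    if band_width <= 0:
        raise ValueError("band_width must be positive")
    counts = Counter(int(h // band_width) for h in hours)
    if not counts:
        return []
    return [
        (i * band_width, (i + 1) * band_width, counts[i])
        for i in range(max(counts) + 1)
    ]


# Made-up resolution times, in hours.
sample_hours = [0.5, 1.0, 3.5, 4.0, 6.0, 9.5, 30.0]
bands = band_resolution_times(sample_hours, band_width=8)
for start, end, count in bands:
    print(f"{start:>3}-{end:<3}h {'#' * count}")
```

Making `band_width` smaller gives thinner bands and more detail; making it larger gives a coarser picture that is easier to read at a glance.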

Wednesday, 31 Jan 2007

This blog entry is a follow up to our previous post about KPIs for Incident Management. The subject of this post is where these reports about KPIs can be located within the Serio tool.

You’ll need access to SerioReports. Remember this is a part of the tool you need to install – if you can’t see it you should install it (provided you have sufficient licenses to do so). Login to SerioReports, and open a Reports Explorer from the File menu.

Incident Counts

These are mainly clustered under the heading 'Logged'. These reports focus on Inputs (as defined in our Service Desk/Helpdesk Metrics white paper), and are both graphical and text based. I’ll pick some out and talk about them individually.

IL17 – Breaks down Incidents by Problem Area Category and Problem Area, with a percentage for each. Useful for understanding the spread of your Incidents.

IL7 – A useful grid that links up the Type of Incident (Fault, Job Request etc) with the Category (Printer, Spreadsheet etc).

IL14 – This report tells you when Incidents are logged during the day. Usually you’ll see two ‘bell’ curves – one in the morning, and one in the early afternoon, as this is typically when most Incidents are reported.

IL22 – This is a graph that shows Incidents by Problem Area. The most used categories are at the top, and this report is useful in weeding out unused Problem Area Categories from your system.

There are around 40 or so in this group, each offering different ways of looking at inputs.

You’ll also find some interesting graphs within SerioClient, under Tools/Performance. See ‘Days logged for Active Incidents’ and ‘Incidents logged and Resolved’. This latter report shows you a week on week view of both tickets logged and tickets resolved – which hopefully (kind-of) match up over the piece. If not, maybe read this about backlogs.
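If you want to reproduce a week on week comparison outside the tool, the idea can be sketched in a few lines of Python. This assumes you can export the logged and resolved dates for your tickets; the function name `weekly_flow` and the dates below are made up for illustration.

```python
from collections import Counter
from datetime import date


def weekly_flow(logged_dates, resolved_dates):
    """Return (iso_year, iso_week, logged, resolved, net_change) per week."""

    def week_key(d):
        iso = d.isocalendar()
        return (iso[0], iso[1])

    logged = Counter(week_key(d) for d in logged_dates)
    resolved = Counter(week_key(d) for d in resolved_dates)
    weeks = sorted(set(logged) | set(resolved))
    return [
        (year, week, logged[(year, week)], resolved[(year, week)],
         logged[(year, week)] - resolved[(year, week)])
        for year, week in weeks
    ]


# Made-up export: dates on which tickets were logged and resolved.
logged_on = [date(2007, 1, 2), date(2007, 1, 3), date(2007, 1, 4),
             date(2007, 1, 9), date(2007, 1, 10)]
resolved_on = [date(2007, 1, 3), date(2007, 1, 11), date(2007, 1, 12)]

flow = weekly_flow(logged_on, resolved_on)
for year, week, n_logged, n_resolved, net in flow:
    print(f"{year}-W{week:02}: logged {n_logged}, resolved {n_resolved}, net {net:+d}")
```

A net change that stays positive week after week is the sign of a growing backlog.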

First Time Fixes (FTF)

Within SerioReports, see the ‘First Time Fix’ report AGT14. This is grouped under ‘Agent Performance’. This report has recently been upgraded (about 2 months ago) and is now excellent (though I say so myself as the author of the report). It shows overall FTF, and FTF broken down by individual Agents and Teams plus other good stuff. If you want to get the latest version of SerioReports you will need to be using Serio 4.6 or later.

I’ll complete this blog entry tomorrow by looking at the remaining KPIs.

Monday, 29 Jan 2007

Regular readers will know that I have been posting recently about Incident Management, the last of which was posted here (there are others also).

This post will cover the subject of KPIs for Incident Management, and offer some practical suggestions for you. I’m going to keep this post general, and probably write a Serio-specific post later that tells you where the reports are (this data is available however from SerioReports).

What follows is not a definitive list – nor is it the 'best' or 'only' list of KPIs. These are just some suggestions for your own Incident reporting repertoire, and are targeted firmly at Incident Managers who need to prepare management reports.

Incident counts

The total number of Incidents logged. You can cut this with a month-on-month trend going back over the previous quarter, or break it down by Category, or Priority, or Impact. What you will be interested in showing is the number (is it up or down, or constant?) and how severe the Incidents have been.

First time fixes

I’ve written about this a lot before, but for the latecomers it is simply a measure of how many customers reporting Incidents get an immediate resolution to the problem – before the call ends. This statistic is telephone based – it has no meaning when Incidents are logged via a web portal, and almost no meaning with Incidents reported by email. However, as the telephone continues to be an important medium for the Helpdesk or Service Desk (the percentage still seems to be over 50%) this continues to be an important statistic.

Resolutions by the Helpdesk or Service Desk

Whereas first time fixes relate to immediate resolution, this KPI simply refers to resolutions made by the Helpdesk/Service Desk without assignment to specialist teams. If this is a high (or rising) figure it suggests a good degree of competence within the group.

Percentage of Incidents handled within their SLA target

Whilst this falls into the remit of the Service Level Manager, it’s still a useful KPI for Incident Management. Typically you’ll be looking at the speed of response, and of resolution. Like the Incident Counts figures, it’s sometimes useful to break this figure down into different groups – such as Impact or Category.

Spread of Resolution Time

This is where you take Incidents and examine how long they took to resolve, looking at the mean resolution time and deviations from the mean. Almost always it’s better to use a graph or histogram to express this, and again it is a useful indicator of quality.
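To make the arithmetic behind these KPIs concrete, here is a minimal Python sketch that works them out from a list of Incident records. The record layout (`channel`, `first_time_fix` and so on), the function names and the sample data are assumptions of mine for illustration only; your own tool will hold this data in its own way.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class Incident:
    channel: str                    # "phone", "email" or "web" (assumed values)
    first_time_fix: bool            # resolved before the call ended
    resolved_by_service_desk: bool  # resolved without assignment to a specialist team
    resolution_hours: float
    sla_target_hours: float


def percentage(part, whole):
    return 100.0 * part / whole if whole else 0.0


def incident_kpis(incidents):
    # First time fix only has meaning for telephone-based Incidents.
    phone = [i for i in incidents if i.channel == "phone"]
    hours = [i.resolution_hours for i in incidents]
    total = len(incidents)
    return {
        "incident_count": total,
        "first_time_fix_pct": percentage(
            sum(i.first_time_fix for i in phone), len(phone)),
        "service_desk_resolution_pct": percentage(
            sum(i.resolved_by_service_desk for i in incidents), total),
        "within_sla_pct": percentage(
            sum(i.resolution_hours <= i.sla_target_hours for i in incidents), total),
        "mean_resolution_hours": mean(hours) if hours else 0.0,
        "resolution_stdev_hours": pstdev(hours) if hours else 0.0,
    }


# Made-up sample data.
sample = [
    Incident("phone", True, True, 0.2, 4.0),
    Incident("phone", False, True, 3.0, 4.0),
    Incident("phone", False, False, 10.0, 8.0),
    Incident("email", False, False, 5.0, 8.0),
]
kpis = incident_kpis(sample)
for name, value in kpis.items():
    print(f"{name}: {value:.1f}")
```

Note that the first time fix percentage is calculated against telephone Incidents only, for the reason given above, while the other figures use all Incidents.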

Friday, 26 Jan 2007

I’ve been blogging previously about Incident Management, and no discussion about Incident Management would be complete without mentioning Major Incidents.

First of all, let me offer a definition: A Major Incident is any Incident that has a significant or substantial effect on part or all of the business.

Leaving aside the issue about VIPs (a stuck keyboard belonging to your Chief Executive causing panic on the help desk), we can say that Major Incidents usually affect significant numbers of employees, and involve important enterprise level services.

So how do we manage Major Incidents? The answer is ‘we do all the stuff that we normally do’ – plus we do some other things. So we make sure that we log the Incident properly, use the CMDB as required, and make the best initial assignment that we can. If you are a Serio user, presumably you’ve set up a Broadcast Alert. Broadcast Alerts were specifically designed for major Incidents, and can be used to send emails about important Incidents to lots of people. You might also use the Serio Text Message Gateway to send text (SMS) messages.

Coming back to the ‘other things’ I mentioned above, this is where your Incident Manager (you have one, right?) takes a lead. What follows could, with a little effort, form the basis of a Major Incident Procedure.

  1. If you can, make a rough estimate of how long the Incident will last, or more accurately, how long the missing service will be unavailable. You might be reluctant to do this, as people have a tendency to hold you to rough estimates at times of stress, but you should do it anyway.
  2. Inform the key stakeholders. By this I mean do more than send them an automated email: use the CMDB to identify the affected parties, and let them know about the Incident – don’t assume they know. Give them your estimate from 1. above. This way, they will know if it is worth starting manual procedures, and it will help them deal with their customers.
  3. If you are a Serio user, post the Incident to the Service Status website, as that is what it is for. Post updates on the Incident here during the day; your customers will appreciate it.
  4. Inform your Problem Manager (you’ve got one, right?). I’ve blogged about Problem Management before here and here (and other places), and we have a Problem Management white paper on that subject for download.
  5. Once the Incident is resolved, perform a review. Analyse the Incident from different perspectives, which should include:
     • Could the Incident have been avoided in the first place?
     • What was the estimated cost to the business of the Incident?
     • How well did we perform in restoring the missing service to users?
     • Did we communicate effectively, both between ourselves and our customers?
     • How well did our internal documentation perform – for instance, our recovery documentation?
  6. Report your findings clearly with recommendations for the future.

Wednesday, 24 Jan 2007

This post will be on the topic of reports you can run in SerioReports – I am going to pick a few and talk about them in detail.

OK, so where is SerioReports? If you don’t have an icon for it on your desktop, remember that it’s an application and needs to be installed (though please remember that it is licensed, so you may need to purchase additional licenses before doing so). Please remember that SerioReports also has its own help file distributed with the product.

We’ve already got quite a bit to say about metrics and reporting. I have blogged about it here, and there is an excellent white paper by my colleague George Ritchie entitled ‘Service Desk Metrics – Getting Started’. If you have not read this white paper yet, and you are interested in reporting, I urge you to do so now.

What I will do in this post is to select a report that I find interesting, and to talk about it and explain about the data it presents and what it displays. I’ll return to this topic next time I post.

I’ll start with SLA Analysis reports – this is where most (but not all) of the service level related reports reside. There are approximately 22 of these – some graphical, some textual. If you can’t find these reports, then log-in to SerioReports, select ‘New Report Explorer’, and then in the tree presented expand the section ‘SLA Analysis’.

I’ll start with a simple graph-based report – SLA Resolutions/Company (SLA16). If you supply a start date, and end date, and the SLA on which the report should be run, you’ll get a histogram that shows you the following:

For each Company on whose behalf you have resolved Incidents, the percentage of resolutions on time. In order to help you make better sense of the percentages, the report also shows you the number of Incidents on which the percentage is based. This is important because you might have a Company that shows ‘0% on time’ – but for just one Incident resolved between your start date and end date.
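The reason for showing the count alongside the percentage is easy to see in a small worked example. The Python below is only a sketch of the calculation, not the way SLA16 itself is built; the function name `on_time_by_company` and the company names are invented for illustration.

```python
from collections import defaultdict


def on_time_by_company(resolutions):
    """Summarise (company, resolved_on_time) pairs.

    Returns {company: (percent_on_time, incident_count)}.
    """
    totals = defaultdict(lambda: [0, 0])  # company -> [on_time, total]
    for company, on_time in resolutions:
        totals[company][0] += int(on_time)
        totals[company][1] += 1
    return {
        company: (100.0 * ok / count, count)
        for company, (ok, count) in totals.items()
    }


# Made-up resolutions between a start date and an end date.
sample_resolutions = [
    ("Acme", True), ("Acme", True), ("Acme", False), ("Acme", True),
    ("Globex", False),
]
summary = on_time_by_company(sample_resolutions)
for company, (pct, count) in sorted(summary.items()):
    print(f"{company}: {pct:.0f}% on time (based on {count} Incidents)")
```

Here the second company shows 0% on time, which looks alarming until you see that the figure is based on a single Incident.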

So who might use SLA16 and why?

Most likely to run this report is a Service Level Manager, or whoever is responsible for overall IT service levels within your organisation (as George Ritchie would say, you’ve got someone responsible for that, right?). You’d run the report as a safety check, and to get behind overall SLA statistics, to make sure that for each Company or Department you serve the service levels are within acceptable limits, or as part of an attempt to identify bottlenecks for further investigation.

Please also remember you can save these reports to PDF easily. Simply print them using the PDF printer installed with SerioReports.

You can also save reports to your favourites list, which means you don't need to search for them next time you want to run them.

Monday, 22 Jan 2007

I’ve posted here, here and here some introductory information about Incident Management. This post will be on the subject of success, and success factors, if you are introducing Incident Management into your Helpdesk or Service Desk. I’m going to assume you are moving from a 'zero' position, starting to embrace ITSM.

  1. Culture. Culture is really important, and being specific I mean the culture of work and service present in your helpdesk or service desk. I’ll state straight away that ITSM is, in my view, not about ‘making the life of IT staff easier’ particularly at the start. The focus is on improved customer and business service, and about higher standards generally in IT service delivery.

As an example, I’ve heard objections to Incident Management such as ‘I don’t have time to log all Incidents’, ‘Choosing a priority and category takes too long, I just want to capture a description’ or ‘This data is of no use anyway’. I'd take these as signs of gentle resistance.

If you are reading this, and you are the IT manager, or Service Delivery Manager, you certainly have a significant role to play in changing this culture (though it’s beyond the scope of this post to discuss strategies for doing this).

  2. A Knowledgebase. I know I said I was discussing this from the perspective of the 'zero' position, but it’s usually possible to make a start on writing down your most common problems and their solutions, and in doing so you’ll make a significant contribution to productivity. There are also a number of very good commercially available sources of knowledgebase content you can use (though choose wisely, and don't overwhelm staff with irrelevant content).

Make development part of your Incident Management process by encouraging engineers to suggest articles. If you are a Serio user, you can include ‘nomination’ to knowledgebase content as part of your resolution process. 

  3. A Configuration Management Database (CMDB). Yes, I know – if you are in the 'zero' position you won’t have one. They really do help and are important, so view the period whilst your processes mature as an interim period until you have a CMDB at your disposal. Whilst you are in this interim period, sometimes using network-based tools such as the Serio Inventory Agent/Workstation explorer can help.
  4. You need an ITSM tool, and you need to have a reasonable idea of how to use it. I have seen people trying to use spreadsheets and it never works in my opinion.
  5. If you are a manager, set yourself some real, tangible, possible (i.e., deliverable) objectives for the first few months of your Incident Management process. It might be ‘reduce Incident resolution times by 20% over 3 months’ or more simply ‘reduce Incident numbers by 10%’ or ‘improve measured customer satisfaction over the period’. Whatever it is, set yourself some goals.

Friday, 19 Jan 2007

This post is an addendum to my earlier post on The Incident Life Cycle.

That post discusses a status value that is a ‘signpost about where we are with an Incident’. This post will talk about where that is in Serio, and how help desks and service desks can use it.

Firstly this is referred to in Serio as Agent Status. There are two types of status value: A and B. These are functionally equivalent – rather than giving you a single status value to use, we gave you two, called Agent Status A and Agent Status B.

In terms of managing Incidents, you can use Agent Status in the following ways.

You can display the Agent Status in your Incident list. Imagine that you had 10 Active Incidents. If you wanted to find out where you are in their resolution process without Agent Status, you’d have to examine the Actions on each one individually. With Agent Status, you might have something like:

  • In progress
  • Awaiting parts
  • On Hold
  • In progress
  • Awaiting Purchase Authorisation
  • With External Supplier
  • In Progress
  • Unstarted
  • Unstarted
  • On hold

You can use the Agent Status to select Incidents to work on by creating a simple Query. For example, ‘show me all On hold Incidents assigned to my Team’

You can access a status report quickly and easily through SerioClient. This shows a pie chart composed of Agent Status data. Simply open a ‘Performance’ chapter and select ‘Agent Status A/B’ distribution.
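To illustrate the three uses just described (a list, a simple Query and a distribution), here is a short Python sketch using the ten example status values from above. It has nothing to do with how Serio stores or queries data; the `Incident` record, the ticket references and the team names are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Incident:
    reference: str     # hypothetical ticket reference
    team: str          # hypothetical team name
    agent_status: str


statuses = [
    "In progress", "Awaiting parts", "On Hold", "In progress",
    "Awaiting Purchase Authorisation", "With External Supplier",
    "In Progress", "Unstarted", "Unstarted", "On hold",
]
teams = ["Service Desk", "Second Line"]
incidents = [
    Incident(f"INC-{n + 1:03}", teams[n % 2], status)
    for n, status in enumerate(statuses)
]


def with_status(items, status, team):
    """A simple 'Query': Incidents with a given status assigned to a team."""
    return [
        i for i in items
        if i.agent_status.casefold() == status.casefold() and i.team == team
    ]


# 'Show me all On hold Incidents assigned to my Team'
on_hold = with_status(incidents, "On hold", "Second Line")
print([i.reference for i in on_hold])

# The data behind a status distribution chart. Case is folded so that
# 'On Hold' and 'On hold' are counted together.
distribution = Counter(i.agent_status.casefold() for i in incidents)
for status, count in distribution.most_common():
    print(f"{status}: {count}")
```

The inconsistent capitalisation in the example list is also a reminder of why status values are best picked from a configured list rather than typed in by hand.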

Setting Agent Status is done through Actions, and is something your own Serio Administrator must configure. There are two basic approaches taken.

The first of these involves setting the Agent Status incidentally. For example, you might take an Action called ‘Assign to Widgets Inc’ where Widgets Inc is a maintenance supplier. Your Action, as part of this, might change the Agent Status value to ‘With External Supplier’.

The second way is more direct, by having Actions solely designed to change the Agent Status value and very little else. For example, ‘Place Incident On Hold’ might be an example Action that does exactly what it says.

For more information about Agent Status and Actions, consult the HowTo guide, the main resource file distributed with Serio products.

Tuesday, 16 Jan 2007

Commentator Rob asks for some suggestions for things to discuss in a forthcoming interview for a Service Desk Manager role. It was such an interesting topic I could not resist.

I recall an interview I had back in the 1990s with a London bank, for the role of IT Service Delivery manager. It started off badly – I arrived promptly, but then the interviewer turned up 30 minutes later and did not apologise for his tardiness.

Almost immediately the interview turned into an ITIL/ITSM question and answer session, along these lines. My qualifications were on my CV, and the actual award documents were in my briefcase – he never asked to see them.

‘What is a CMDB’ he asked, starting almost as soon as I had sat down.

I gave what I thought was a good answer in my own words, explaining the role it plays for activities such as Change Management. He then started to split hairs, and something became apparent: he had learned to recite a lot of the text from the Service Support book like a parrot, and allowed no interpretation except for his own very literal view.

I had just finished (successfully) a large ITSM project, but this hardly featured in the interview. I left pitying the poor person who would get the job.

To come back to the point of this post, and to Rob’s question, avoid this scenario at all costs. Check the qualifications of your candidates carefully, and then assume they understand the relationship between the CMDB and Change Management. Surely what you are interested in is how they can use this kind of knowledge to deliver business benefits.

I recall another interview at a now-defunct manufacturing company. The person doing the interview here was someone who turned out to be a very capable boss. At the end of the interview I asked what he had thought of some of my answers, to which he replied ‘I was much more interested in how your reasoning process worked, and how well you could communicate with me’.

Some of the things he asked in the interview were topics I’ve touched on in this blog in recent months. I can remember the questions clearly because after 2 years I moved to another post, and participated in the interviews for my replacement.

‘You’ve got 30 IT staff, and I’m going to tell you some are wonderful and some are.. not wonderful. How do you find out which is which?’

‘Once you’ve sorted the ‘not wonderful’ into a group, what do you do?’ (His favourite response to this was from a guy who said ‘make them walk the plank, of course’).

‘The expectations of the business are not being met, where do you start?’

‘We’ve got 1000 open tickets. What would your action plan be?’

‘Your service teams have a culture of blame and recrimination. Tell me what you would do over 6 months to improve this situation.’ (On this one, I can tell you he was not looking for anything along the lines of nights-out in the pub, or fatuous team-building exercises).

‘You have been asked to produce an IT management summary report for our esteemed proprietor. What would you include?’