Serio Blog

Friday, 16 Feb 2007

In an earlier post I touched on the subject of interviewing, in response to a commentator's question. This turned out to be a really popular article, with a significant number of unique visitors and email correspondence generated. Thank you all for your comments.

I'm going to revisit this subject today. Our internet logs tell us that quite a lot of people are searching for information such as 'interview questions' or 'how to conduct a service desk interview' and I want to write an article that might help. However, before that I want to talk about spambots and recruitment firms.

One thing the earlier post did was generate, after a period of about a week, a very large number of calls from recruitment firms – all asking for either 'Pete' or 'George', all adamant we were looking for a new Service Desk Manager, and all very, very persistent. At first we didn't understand this, but then someone explained it to us. There are a significant number of what I will call spambots that are looking for vacancies on websites, and clearly they've been crawling ours (bots are programs that read all of the pages in a website). These bots are clearly trying to read the text, and then sell the 'opportunity' to recruitment firms. However in this case the bots seem to have misread the post.

OK, for the benefit of bots: please do not call us, this post does not mean we have a vacancy :-)

Now on to the subject of interviewing. It's always surprised me that most people who are interviewing have had no training, and very little guidance. Presumably you are meant to learn the technique from the times you've served as the interviewee. What I'd say is that if it's your first interview, don't be afraid to role-play with a colleague (it's best done with a peer). Get your colleague to sit with you playing the role of a candidate, and afterwards discuss with them how the interview went. This will help you get into the swing of things, and act as a nice rehearsal.

For the rest of this post, I'll adopt a do/don't format.

Do: Try to bring the interviewee into the interview as early as possible. In other words, don't make them sit there and listen to the history of the company starting from when your founder was born. State a little bit about the job and try to get them talking. Assume they've done some research.

Do: Use the CV as a talking point. Most people will have projects listed on their CV. Try to get them to talk about their projects, focusing on what they did and their role in the project. Ask them about what worked, and what did not. Ask them what they learned.

Do: Give people a little time to compose themselves. Once shown to the interview room, I ask if they'd like tea or coffee. Regardless of the answer, I leave for just a few minutes (note: not 5 or 10 minutes, just 2).

Do: Keep your questions open-ended and allow the interviewee to come in as often as possible.

Don't: Start with areas of discussion or topics that are likely to be controversial. Leave these towards the end when hopefully you've built up a rapport. Personally I avoid arguments – if someone states a fact which I am certain is incorrect, I just say 'are you sure about that?'.

Don't: Bombard them with fact-based questions like they are sitting an exam. Just drop these into the interview. Don't turn the interview into an ITIL Q & A.

Don't: Paint an overly rosy picture. True story: a colleague of mine here at Serio (years ago) got a job as an IT manager. He had the foresight to ask during the interview 'why is the current IT manager leaving' and was told 'he's started his own business' which seems fair enough. After starting my colleague found out the truth: the company was 2 years old and had burned the 3 previous IT managers. The first suffered a nervous breakdown in the office after 9 months (crying, shaking), the second was fired after 4 months, and the third had just got up from his desk one day and walked out never to return. When he found out, my colleague was, of course, deeply unimpressed.

Don't: Assume the qualifications and references are all genuine. I've been surprised by the extent to which false information is placed on CVs – check things out carefully.

Do: Be careful about 'Why?' and 'Why did you do that?' as it will unsettle some people – making them feel like 'they've put their foot in it'. If you can, preface such questions with 'that's interesting' to make people feel more at ease.

Do: Finish with 'Are there things you'd like to talk about we have not mentioned?' and/or 'Do you have any questions for me?'.

Don't: Be secretive. Someone has taken their time to see you, so tell them about the recruitment process. Don't assume the agency (if they have come through an agency) has told them anything.

Do: Ask relevant questions for the job (naturally), following the advice given above (although question is the wrong word, it's more like 'topics for discussion').

Next week I'll suggest some question topics for you to use for a few typical Helpdesk or Service Desk roles.

Wednesday, 14 Feb 2007

This post will be about reports available in SerioReports. My last topic focused on Incident KPIs; this post will focus on reports that might be of interest to a service level manager.

At this point, I’ll try to define what we aim to achieve with service level management:

Maintain and improve the quality of IT Service Delivery by agreeing what our service levels should be, and then constantly monitoring and reporting on the service levels we are achieving. Where appropriate make recommendations for improvements in service delivery.

There is more to this than simply response and resolve times, important as they are, but I expect we will pick up service level management in greater detail in future posts.

Locating the Reports

First off, you will need SerioReports – this is where most of the package’s reports are. SerioReports is a licensed product (so you need a license!) that you have to install on a computer somewhere. If you can’t find it, ask your Serio Administrator to install it for you.

A large number of the service level management (or SLA) reports are grouped under SLA Analysis, where you’ll find about 25. What I will do now is to pick some of the more interesting ones and talk about them.

Firstly I’ll start with overview reports.

SLA5 – This is a very simple report. Like most of these reports, it takes a start date, end date and an SLA, and then shows you (by priority) your Incident resolution within agreed SLA times. This report is sometimes printed to PDF and circulated to ‘casually interested’ parties, or pinned to notice boards.

SLA6 – This is another simple report. It reports on callbacks or responsiveness, and it is a fairly common industry measure of performance. It is the amount of time taken for the Service Desk or Helpdesk to respond to an Incident being reported – it is not an automated response. In Serio when logging a ticket, there is a checkbox that says ‘Customer needs a callback’ – this report uses Incidents you’ve flagged in this way.

Moving onto more detailed reports there is SLA4. This report uses a matrix to report on Incident resolution performance by Company. Service level managers would use it, after checking overall performance, to examine the service levels obtained for each major Company or department that they support, and to identify areas of poor performance for particular groups.

Monday, 12 Feb 2007

This post is to tie together all of my previous posts about Incident Management that have appeared over the past month or so.

One of the things I really like about the idea of a blog is its somewhat informal nature and the expectation its content may be a little eclectic at times. One of its disadvantages is that it can be difficult to follow a ‘thread’ of articles over time - hence wrap-up posts like this are useful in some cases.

I started off in January with this topic: Introducing Incident Management, which offered a definition and talked about the concepts of Ownership and Assignment.

Next I posted about The Incident Life Cycle – which pretty much did what it said in the title. It defined and explained the major steps in the Life Cycle, and talked about the ‘workflow position’ with a few examples of how to find and use that in the Serio tool.

I then went on to blog about Escalation in Incident Management. This post talked about Escalation – a term that means different things to different people, and offered some examples.

I followed this by talking about Success Factors in Incident Management and listed some things to help you develop a better Incident Management process – things like culture, and giving yourself some attainable objectives in the opening month.

Next up: Major Incidents. Really this is worth quite a few articles, but I included it here because it seems to be something that is ‘mysterious’ to some people. I defined a Major Incident, and suggested a (by no means comprehensive) list of things that might be part of a Major Incident process – to give those who had asked about such a thing an idea of where to start.

This was followed by an article on Key Performance Indicators for Incident Management. The idea here was to help with the reporting and metrics side of things – again with those getting started and having to do this for the first time.

My colleague and blog Robin to my Batman :-) posted in this post and also here about some reports in Serio that could be used in Incident reporting. These articles were detailed and told you where to locate the reports – so as to remove any confusion at all. All you need to do is draw conclusions and make recommendations for improvement based on the data you see.

Finally, we all need to develop good habits – so I posted some.

A PS blog post was added after a commentator’s question about quality. Phew.

Friday, 09 Feb 2007

Emailer Steve responds to this Problem Management white paper and poses this question: what can you do if no Problems are actually raised by the Incident resolution teams? Steve specifically mentions a period of over 4 months with no Problems at all being raised. Steve’s teams are handling 2500 or so Incidents per week, and he has 30 or so staff on the Service Desk and a further 70 working in specialist teams like those I’ve referred to in this post on escalation.

Firstly, for new readers you can find a definition of a Problem here. It’s one of the outputs from Incident Management into Problem Management.

My editorial brief for this blog is to provide practical suggestions and to avoid unnecessary jargon – so that’s what I’ll try to do. What follows is in no particular order, it’s just some ideas to check off.

  1. Shared vision and understanding. If Steve has a vision of how things should work, and the use of Problem Management is a part of that, that vision needs to be shared between the managers and the IT service delivery staff. This can be quite hard, as sometimes the focus of more technically minded staff is not on service management. One of the things I see time and again is where the management view is one thing, and IT staff another. Managers do sometimes have a tendency to assume that staff know what is expected, when the staff either don’t, take a contrary view, or simply don't care.

Having stated the problem, I’ll try to suggest some solutions. One approach is to have regular IT service meetings where issues like this can be raised, managers can be persuasive, and staff can air their views. In my experience a lot of organisations simply don’t have meetings like this. It’s beyond the scope of this post to write about how such meetings should be conducted, but avoid having groups that are too large (some staffers will feel intimidated from contributing), make sure you have a chairman or woman who knows what they are doing, have an agenda, and produce action points afterwards.

  2. Make sure your procedures are written down in a concise, usable form. This comes back to my point about managers sometimes misleading themselves about the understanding that others have. However, in doing this you want a concise document that is useful and is not like a legal document. At Serio we refer to this as an Operation Manual and have a template/example for customers to use.

  3. Ensure that your staff understand what a Problem is, as the actual meaning can be quite elusive to some people. The best way to illustrate this is to pick one or two Incidents, and explain in your meetings why these are Problems (be careful not to be seen to be critical in the early stages).

  4. You should have a Problem Manager, and everyone should know who that person is. This person owns all the Problems, and the Problem Management process. The Problem Manager should have a team. In all probability, this will not be a full time team, but will be drawn from the existing service teams so that staff have dual roles (we work on both Incidents and Problems). Pay particular attention to the composition of the Problem Management Team – choose from a good cross section of disciplines and Incident Management groups. These people can act as advocates of Problem Management for you and as ‘spotters’ in the Incident teams.

  5. Make sure it’s really, really easy for staff to flag Problems (and I mean REALLY easy). For instance, you could define a Cause Category of ‘Unknown’ which is applied at resolution time and this could be taken by the Problem Manager as someone saying ‘this is a potential Problem record’. This would be my advice for Serio users. Make sure that whatever Cause code you use, its meaning ('hey this is a potential Problem!') is understood widely.

  6. Make sure your ITSM tool is up to scratch. For instance, raising a Problem ticket from an Incident should be easy. Linking multiple Incidents to a Problem should also be easy.

  7. Enhance the role of the Service Desk in Problem Management by getting them involved in reviewing Incidents. Make a specific responsibility one of scanning for Problems.

  8. Ensure that the Incident Manager and Problem Manager have a good working relationship on all levels. If they sit next to or within earshot of one another you might find that this helps the Problem Manager be aware of issues coming through in the Incident Management process that he has an interest in.

  9. Whatever your procedures are, make sure the Problem Management process does not slow in any way the ability of Agents to close Incident tickets – they might want to do this quickly for all sorts of reasons (see yesterday’s post). They should be able to flag Incidents quickly for consideration by the Problem team, and it should not slow their ability to close the ticket.
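
To make the 'Unknown' Cause Category idea a little more concrete, here is a minimal sketch of how a Problem Manager might pull out flagged Incidents for review. It is not Serio code: the record layout and field names (problem_area, cause) are assumptions I have made up for illustration, and in practice the data would come from an export or report in your own tool.

```python
from collections import Counter

# Hypothetical resolved-Incident records; the field names are assumptions.
incidents = [
    {"id": 101, "problem_area": "Email", "cause": "Unknown"},
    {"id": 102, "problem_area": "Printer", "cause": "Paper jam"},
    {"id": 103, "problem_area": "Email", "cause": "Unknown"},
    {"id": 104, "problem_area": "VPN", "cause": "Unknown"},
]

def potential_problems(records, flag="Unknown"):
    """Count flagged Incidents per Problem Area, most frequent first."""
    counts = Counter(r["problem_area"] for r in records if r["cause"] == flag)
    return counts.most_common()

review_list = potential_problems(incidents)
print(review_list)
```

Areas that keep appearing near the top of the list are the natural candidates for a Problem record. The Agent's only extra work is choosing one Cause code at resolution time, which is the point of keeping flagging easy.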

Wednesday, 07 Feb 2007

Emailer and sometime commentator Jim asks about quality and his service teams. Specifically, he has second and third line support teams that both have what Jim calls an alarming tendency to mark Incidents as ‘resolved’ when they are not resolved at all. Jim asks if I have any practical suggestions, other than just shouting at people.

Firstly, remember I’ve blogged quite a bit recently about Incident Management.

Coming to Jim’s question, I do have some things that can be considered in circumstances like these. The points for consideration are in no particular order.

  1. Ask if you have created or added to the problem yourself by an inappropriate use of metrics. By this I mean you’ve told your team members that you are looking very closely at statistics for who is resolving Incidents, or that you’ve told the team you are looking at timeliness of resolution on an Agent-by-Agent basis (or worse, both). Now when I say ‘you’ve told’ I don’t necessarily mean you stood on a chair and said that's what you were going to do - remember that people gossip and talk informally amongst themselves. So, staff can get this impression by you simply mentioning the statistics to individuals in a negative way such as ‘why have you only resolved 10 tickets this week?’.

Metrics such as Agent performance need to be used with a little bit of caution, as they often don’t tell the whole story. A colleague of mine here at Serio tells a tale from when he worked as a programmer on a large team fixing bugs in an insurance firm. Bugs were logged, assigned to programmers, and then fixed. A new development team manager was recruited and after two weeks the new manager issued a memo to all development and testing staff complaining about ‘poor numbers’ and proceeded to name an engineer. The manager’s mistake was this: the ‘poor numbers’ guy was the brightest and best in the group, and handled some of the toughest jobs that came into the group – therefore his ‘fix rate’ was much lower.

Does this all mean that such statistics should be avoided? Absolutely not. It simply means that they should be used with caution, and you need to be aware of how your staff might regard the use of such statistics.

One positive step is to make sure that you have statistics for the number of Incidents re-opened, focusing on who the original resolving Agent was, and to use this in conjunction with other Agent performance stats.

  2. Tell your teams that you perceive a problem. Try to bring them onside, and appreciate the need for quality rather than premature fixes. Try to understand how your team members see their role and what pressures they feel.

  3. Consider the roles of Team Leaders. Ask them to review some or all of the Incidents being resolved by their team.

  4. Introduce a 2-stage completion process if you don’t have one. By this I mean that when service teams resolve Incidents, they put the Incident to ‘Pending Complete’ and re-assign back to the Helpdesk or Service Desk. What then happens is we check with the customer proactively to make sure that the fault is resolved.

  5. Consider the possibilities of skills gaps within your teams, particularly the second and third-line support. Examine Incidents that have been re-opened for clues as to why this problem is happening.

  6. Make sure that the Helpdesk or Service Desk is actually re-opening Incidents, rather than logging new ones. I’m saying this but I know it’s hard to do 100% right. I’ve blogged before about having a call handling script – amend your script to ask if Incidents have been reported previously, and give staff guidance on when it is right to re-open.
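
The re-open statistic suggested earlier in this post can be sketched in a few lines. This is only an illustration under assumptions: the record layout (resolved_by, reopened) is hypothetical, and the figure is only as good as your Service Desk's discipline in re-opening tickets rather than logging new ones.

```python
from collections import defaultdict

# Hypothetical records: who first resolved each Incident, and whether it
# was later re-opened. Field names are assumptions for illustration.
resolved = [
    {"id": 1, "resolved_by": "anna", "reopened": False},
    {"id": 2, "resolved_by": "anna", "reopened": True},
    {"id": 3, "resolved_by": "ben", "reopened": False},
    {"id": 4, "resolved_by": "ben", "reopened": False},
    {"id": 5, "resolved_by": "anna", "reopened": True},
    {"id": 6, "resolved_by": "anna", "reopened": False},
]

def reopen_rates(records):
    """Return {agent: (resolutions, fraction later re-opened)}."""
    totals = defaultdict(int)
    reopened = defaultdict(int)
    for r in records:
        totals[r["resolved_by"]] += 1
        if r["reopened"]:
            reopened[r["resolved_by"]] += 1
    return {a: (totals[a], reopened[a] / totals[a]) for a in totals}

rates = reopen_rates(resolved)
for agent, (count, rate) in sorted(rates.items()):
    print(f"{agent}: {count} resolved, {rate:.0%} re-opened")
```

Reporting the resolution count next to the re-open rate matters: a high count on its own looks good, but combined with a high re-open rate it may point to premature closing. As with the 'poor numbers' story, treat the output as a prompt for a conversation and not as a verdict.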

By the way, if you’ve read this post and are thinking ‘this does not affect me’ I have to ask how do you know, and are you sure? 

Monday, 05 Feb 2007

One of the tasks that many Serio Administrators can find a little daunting is preparing categorisations for Incidents, Problems and Changes. What I’m referring to is populating the data fields that help us to manage tickets.

I’m going to list some tips to help those setting about this for the first time, or those reviewing their categories.

  • Remember that someone will have to use this information during Issue logging, so make the number of Problem Area Categories that you have available during Issue logging less than 12 or so if you can. Then try to keep the number of Problem Areas within each Category less than 12 as well. Doing this gives a reasonable list for helpdesk and service desk staff to work with.

  • It’s OK (and indeed desirable) to have a small number of Issue Types. Remember these just record the type of ticket – for Incidents, you might have Fault, Query, Work Request and so on.

  • Try to create categories and data that are unambiguous. In an ideal situation, there will be an obvious best choice for most Issues being logged. Overlap in Problem Areas can cause the same type of fault to be classified in different ways.

  • The data you are creating will be used in Incident, Problem and Change reporting.

  • If you are starting from scratch, remember to examine any legacy data you have.

  • Also if you are starting from scratch, don’t be afraid to prepare your data away from the tool itself. Sometimes it’s useful simply to write your data down in a familiar tool such as a spreadsheet where your focus will be on the data rather than the software ITSM tool. However, if you do this make sure you are familiar with the structure of the data the tool requires.

  • It’s a process. Therefore after setting-up your data see how it’s working by examining tickets you’ve logged. Don’t be afraid to remove data that is not being used.

  • Remember that Serio allows you to have different classification data for Incidents, Problems and Changes.

  • Finally, have an ‘other/unclassified’ category. This will help for odd types of infrequently occurring Incidents – but make sure that this is not overused.
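
If you prepare your data away from the tool, as suggested above, a few of these tips can be checked automatically. The sketch below is a rough illustration and not part of Serio: the data structures, the usage counts and the limit of 12 (taken from the '12 or so' guideline) are all assumptions you would adapt to your own export.

```python
MAX_CHOICES = 12  # the "12 or so" guideline from the tips above

# Hypothetical classification data: Category -> list of Problem Areas.
categories = {
    "Hardware": ["Printer", "Laptop", "Monitor"],
    "Software": ["Spreadsheet", "Email", "Browser"],
    "Other": ["Unclassified"],
}

# Hypothetical usage counts from logged tickets: (Category, Area) -> count.
usage = {
    ("Hardware", "Printer"): 40,
    ("Hardware", "Laptop"): 25,
    ("Software", "Email"): 30,
    ("Other", "Unclassified"): 5,
}

def review(categories, usage, max_choices=MAX_CHOICES):
    """List over-long pick lists and Problem Areas nobody has used."""
    findings = []
    if len(categories) > max_choices:
        findings.append(f"Too many Categories: {len(categories)}")
    for cat, areas in categories.items():
        if len(areas) > max_choices:
            findings.append(f"Too many Problem Areas in {cat}: {len(areas)}")
        for area in areas:
            if usage.get((cat, area), 0) == 0:
                findings.append(f"Unused: {cat} / {area}")
    return findings

def other_share(usage, other_category="Other"):
    """Fraction of tickets that ended up in the catch-all category."""
    total = sum(usage.values())
    other = sum(n for (cat, _), n in usage.items() if cat == other_category)
    return other / total if total else 0.0

findings = review(categories, usage)
print(findings)
print(f"Catch-all share: {other_share(usage):.0%}")
```

What counts as 'overused' for the catch-all category is a judgment call, so the sketch only reports the share and leaves the threshold to you.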



Friday, 02 Feb 2007

Over the past two weeks or so I have been blogging about Incident Management, and in particular how it is described in the ITIL Service Support book. At a later date I’ll draw these posts together, but for now I’ll explain why I started this thread.

My motive was to try to illustrate that it is quite straight-forward, and to try to de-mystify the subject for readers. It was also to point out that, valuable as it is, it’s not a silver bullet and nor is it that far removed from what properly managed IT service Helpdesks and Service Desks do anyway. Hopefully from my posts it is clear that ITIL Incident Management is non-prescriptive – it doesn’t tell you what to do. Instead, it offers a framework you can use for your own organisation and circumstances (therein lies both a strength [flexible enough for organisations of different types and sizes] and a weakness [insufficient guidance], depending on your point of view).

What I want to do is to write about some of the habits I see being adopted by the successful 'top 20%' of Helpdesks and Service Desks I have encountered. They are in no particular order, and the list is not definitive (in fact, I may return to this later). However, you’ll be able to see how you compare, and if you’ve others to add use the comments field. You might also want to have a look at my Success Factors in Incident Management post.

Habit: There is a good team structure, for instance Service Desk, Second Line and so on. Each Team has a Team Leader.

Habit: The overall Incident Management process is written down, with a clear focus on explaining what the responsibilities of different team members are. For example, we’d try to write down some of the responsibilities of our Team Leaders.

Habit: There is an Incident Manager whom everyone can identify. Everyone understands his or her responsibilities because these are written down.

Habit: The Incident Manager produces reports and metrics for the Incident Management team.

Habit: There is a constant drive to develop and maintain a Knowledgebase. There is a nominated Knowledgebase editor to whom suggestions can be made about new articles, and on which the editor acts quickly – checking the articles for relevance and accuracy.

Habit: The quality of Incident records is seen as important. Staff make an effort to ensure proper classification and recording.

Habit: Strong focus on the central role of the Helpdesk or Service Desk in communication, with an emphasis on practicalities. For example, ensuring that ownership is maintained with the Service Desk even when Incidents are assigned to support teams, and then maintaining an involvement (particularly in terms of customer communication) with those Incidents by the Service Desk staff.

Habit: Careful use of status values to deliver a Workflow Position.

Habit: They make time for regular weekly team meetings for those most directly involved in Incident Management. These meetings follow a standard agenda, but allow flexibility for different issues to be raised. The meetings vary the chairman or woman, but the chair has clear guidelines on how to conduct the meeting.

Thursday, 01 Feb 2007

This is just to round off yesterday’s post about KPIs in Incident Management, and where you can find them in Serio.

Resolutions by the Service Desk or Helpdesk

This refers to the resolutions achieved by the Helpdesk/Service Desk. You’ll find this data available as a column in the First Time Fix report AGT14.

Percentages of Incidents Handled within SLA Target

Most of the SLA-based reports are clustered under ‘SLA Analysis’ – there are around 25 in all. Picking a few at random:

SLA5 – A nice, simple report which shows the percentage of resolutions on time, broken down by Priority.

SLA4 – A more detailed account, broken down by Company, of SLA resolutions on time.

SLA9 – Shows your response performance, again organised by Priority.

SLA10 – This is an interesting but complex report in graph form. It shows your SLA resolutions on time on a month-by-month basis. It also allows you to add your targets (for instance, 90% on time) onto the same graph for direct comparison, and also trend analysis.

Spread of Resolution Time

For this, see report ‘SLA Resolution Time Profile’ SLA12 and SLA12a. These show you in a convenient graph form how long Incidents are taking to resolve. The spread of resolution times is presented in a useful ‘banded’ form, and you have a lot of control over how thick or thin the bands actually are.
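
For readers who want to see what 'banded' means in practice, here is a small sketch of the idea. It does not reproduce how SLA12 works internally (I am not describing the report's implementation here); the sample times and the band width are made-up assumptions.

```python
from collections import Counter

# Hypothetical resolution times in hours.
resolution_hours = [0.5, 1.2, 3.0, 3.9, 4.0, 7.5, 9.1, 26.0]

def banded_profile(hours, band_width=4):
    """Count Incidents per band; band k covers [k*width, (k+1)*width) hours."""
    bands = Counter(int(h // band_width) for h in hours)
    return {
        f"{k * band_width}-{(k + 1) * band_width}h": bands[k]
        for k in sorted(bands)
    }

profile = banded_profile(resolution_hours, band_width=4)
for band, count in profile.items():
    print(f"{band:>8} {'#' * count}")
```

Making the bands thicker (say band_width=8) smooths the picture, while thinner bands show more detail at the cost of a longer graph. Empty bands are simply left out in this sketch.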

Wednesday, 31 Jan 2007

This blog entry is a follow up to our previous post about KPIs for Incident Management. The subject of this post is where these reports about KPIs can be located within the Serio tool.

You’ll need access to SerioReports. Remember this is a part of the tool you need to install – if you can’t see it you should install it (provided you have sufficient licenses to do so). Login to SerioReports, and open a Reports Explorer from the File menu.

Incident Counts

These are mainly clustered under the heading 'Logged'. These reports focus on Inputs (as defined in our Service Desk/Helpdesk Metrics white paper), and are both graphical and text based. I’ll pick some out and talk about them individually.

IL17 – Breaks down Incidents by Problem Area Category and Problem Area, with a percentage for each. Useful for understanding the spread of your Incidents.

IL7 – A useful grid that links up the Type of Incident (Fault, Job Request etc) with the Category (Printer, Spreadsheet etc).

IL14 – This report tells you when Incidents are logged during the day. Usually you’ll see two ‘bell’ curves – one in the morning, and one in the early afternoon, as this is typically when most Incidents are reported.

IL22 – This is a graph that shows Incidents by Problem Area. The most used categories are at the top, and this report is useful in weeding out unused Problem Area Categories from your system.

There are around 40 or so in this group, each offering different ways of looking at inputs.

You’ll also find some interesting graphs within SerioClient, under Tools/Performance. See ‘Days logged for Active Incidents’ and ‘Incidents logged and Resolved’. This latter report shows you a week on week view of both tickets logged and tickets resolved – which hopefully (kind-of) match up over the piece. If not, maybe read this about backlogs.
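
The logged-versus-resolved comparison is easy to reason about with a small sketch. This is an illustration of the arithmetic only, not of the SerioClient graph itself; the ticket dates are invented, and the running backlog assumes you start the window with nothing outstanding.

```python
from collections import Counter
from datetime import date

# Hypothetical tickets: (date logged, date resolved or None if still open).
tickets = [
    (date(2007, 1, 15), date(2007, 1, 16)),
    (date(2007, 1, 17), date(2007, 1, 24)),
    (date(2007, 1, 18), None),
    (date(2007, 1, 22), date(2007, 1, 23)),
    (date(2007, 1, 25), None),
]

def iso_week(d):
    """Return (ISO year, ISO week number) for a date."""
    year, week, _ = d.isocalendar()
    return (year, week)

def weekly_flow(tickets):
    """Rows of (week, logged, resolved, running backlog since window start)."""
    logged = Counter(iso_week(l) for l, _ in tickets)
    resolved = Counter(iso_week(r) for _, r in tickets if r is not None)
    rows, backlog = [], 0
    for wk in sorted(set(logged) | set(resolved)):
        backlog += logged[wk] - resolved[wk]
        rows.append((wk, logged[wk], resolved[wk], backlog))
    return rows

for wk, n_logged, n_resolved, backlog in weekly_flow(tickets):
    print(wk, n_logged, n_resolved, backlog)
```

If the last column keeps climbing week after week, the two lines on the graph are not matching up and a backlog is building.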

First Time Fixes (FTF)

Within SerioReports, see the ‘First Time Fix’ report AGT14. This is grouped under ‘Agent Performance’. This report has recently been upgraded (about 2 months ago) and is now excellent (though I say so myself as the author of the report). It shows overall FTF, and FTF broken down by individual Agents and Teams plus other good stuff. If you want to get the latest version of SerioReports you will need to be using Serio 4.6 or later.

I’ll complete this blog entry tomorrow by looking at the remaining KPIs.

Monday, 29 Jan 2007

Regular readers will know that I have been posting recently about Incident Management, the last of which was posted here (there are others also).

This post will cover the subject of KPIs for Incident Management, and offer some practical suggestions for you. I’m going to keep this post general, and probably write a Serio-specific post later that tells you where the reports are (this data is available however from SerioReports).

What follows is not a definitive list – nor is it the 'best' or 'only' list of KPIs. These are just some suggestions for your own Incident reporting repertoire, and are targeted firmly at Incident Managers who need to prepare management reports.

Incident counts

The total number of Incidents logged. You can cut this with a month-on-month trend going back over the previous quarter, or break it down by Category, or Priority, or Impact. What you will be interested in showing is the number (is it up or down, or constant?) and how severe the Incidents have been.
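
As a rough illustration of the month-on-month view, here is a sketch that counts Incidents per month and shows the change from the previous month. The sample data and the Priority labels (P1 to P3) are assumptions for the example only.

```python
from collections import Counter
from datetime import date

# Hypothetical logged Incidents: (date logged, Priority).
logged = [
    (date(2006, 11, 3), "P1"), (date(2006, 11, 20), "P3"),
    (date(2006, 12, 5), "P2"), (date(2006, 12, 6), "P3"),
    (date(2006, 12, 19), "P3"),
    (date(2007, 1, 9), "P1"), (date(2007, 1, 10), "P2"),
]

def monthly_counts(records):
    """Count Incidents per (year, month)."""
    return Counter((d.year, d.month) for d, _ in records)

def trend(counts):
    """Return (month, count, change vs previous month) in date order."""
    rows, previous = [], None
    for month in sorted(counts):
        change = None if previous is None else counts[month] - previous
        rows.append((month, counts[month], change))
        previous = counts[month]
    return rows

# The same count, cut by Priority, to show how severe the Incidents were.
by_priority = Counter((d.year, d.month, p) for d, p in logged)

for row in trend(monthly_counts(logged)):
    print(row)
```

Swapping Priority for Category or Impact in the last Counter gives the other breakdowns mentioned above.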

First time fixes

I’ve written about this a lot before, but for the latecomers it is simply a measure of how many customers reporting Incidents get an immediate resolution to the problem – before the call ends. This statistic is telephone based – it has no meaning when Incidents are logged via a web portal, and almost no meaning with Incidents reported by email. However, as the telephone continues to be an important medium for the Helpdesk or Service Desk (the percentage still seems to be over 50%) this continues to be an important statistic.
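
Because the statistic only makes sense for telephone calls, the calculation should leave other channels out of the denominator. A minimal sketch, with a made-up record layout (channel, resolved_on_first_call) that you would map onto your own data:

```python
# Hypothetical Incident records; field names are assumptions.
incidents = [
    {"channel": "phone", "resolved_on_first_call": True},
    {"channel": "phone", "resolved_on_first_call": False},
    {"channel": "phone", "resolved_on_first_call": True},
    {"channel": "email", "resolved_on_first_call": False},
    {"channel": "web", "resolved_on_first_call": False},
]

def first_time_fix_rate(records):
    """Percentage of telephone Incidents resolved before the call ended."""
    phone = [r for r in records if r["channel"] == "phone"]
    if not phone:
        return None  # no telephone Incidents, so the KPI is undefined
    fixed = sum(1 for r in phone if r["resolved_on_first_call"])
    return 100.0 * fixed / len(phone)

rate = first_time_fix_rate(incidents)
print(f"First time fix rate: {rate:.1f}%")
```

Had the email and web Incidents been included in the denominator, the figure would drop from about 67% to 40%, which would understate how the telephone side of the desk is doing.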

Resolutions by the Helpdesk or Service Desk

Whereas first time fixes relate to immediate resolution, this KPI simply refers to resolutions made by the Helpdesk/Service Desk without assignment to specialist teams. If this is a high (or rising) figure it suggests a good degree of competence within the group.

Percentage of Incidents handled within their SLA target

Whilst this falls into the remit of the Service Level Manager, it’s still a useful KPI for Incident Management. Typically you’ll be looking at the speed of response, and of resolution. Like the Incident Counts figures, it’s sometimes useful to break this figure down into different groups – such as Impact or Category.
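
Here is one way the resolution side of this KPI might be calculated, broken down by Priority. The targets and the sample Incidents are invented for the example; your own SLA will define the real targets and how elapsed time is measured (working hours, clock stops and so on), none of which this sketch attempts.

```python
from collections import defaultdict

# Hypothetical SLA resolution targets, in hours, per Priority.
targets_hours = {"P1": 4, "P2": 8, "P3": 24}

# Hypothetical resolved Incidents: (Priority, hours taken to resolve).
resolved = [
    ("P1", 3.0), ("P1", 5.5),
    ("P2", 7.0), ("P2", 8.0),
    ("P3", 30.0), ("P3", 12.0), ("P3", 20.0), ("P3", 2.0),
]

def sla_compliance(records, targets):
    """Percentage of Incidents resolved within target, per Priority."""
    met = defaultdict(int)
    total = defaultdict(int)
    for priority, hours in records:
        total[priority] += 1
        if hours <= targets[priority]:
            met[priority] += 1
    return {p: 100.0 * met[p] / total[p] for p in sorted(total)}

compliance = sla_compliance(resolved, targets_hours)
for priority, pct in compliance.items():
    print(f"{priority}: {pct:.0f}% within target")
```

The same function works for response times if you pass response targets and response durations instead.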

Spread of Resolution Time

This is where you take Incidents and examine how long they took to resolve, looking at the mean resolution time and deviations from the mean. Almost always it’s better to use a graph or histogram to express this, and again it is a useful indicator of quality.
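
A short sketch shows why the mean alone can mislead. The figures are invented, and the 'more than two standard deviations from the mean' rule for picking out unusual Incidents is just one common convention I have assumed here, not something ITIL prescribes.

```python
import statistics

# Hypothetical resolution times in hours; one Incident took far longer.
resolution_hours = [2.0, 3.0, 4.0, 4.0, 5.0, 6.0, 4.0, 4.0, 40.0]

mean_hours = statistics.mean(resolution_hours)
median_hours = statistics.median(resolution_hours)
spread = statistics.pstdev(resolution_hours)  # population standard deviation

# Incidents more than two standard deviations away from the mean.
outliers = [h for h in resolution_hours if abs(h - mean_hours) > 2 * spread]

print(f"mean {mean_hours:.1f}h, median {median_hours:.1f}h, "
      f"std dev {spread:.1f}h, outliers {outliers}")
```

In this sample a single 40 hour Incident pulls the mean up to twice the median, which is exactly the sort of thing a histogram makes obvious at a glance.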