Serio Blog

Thursday, 21 Feb 2008

It's easy to overlook training.

I can think of a few occasions in my career, particularly during a spell I had of short-term contract work, where I'd show up for work on a Monday and the Service Delivery Manager would seem quite surprised, usually muttering something along the lines of 'I thought you weren't due till next week'.

Out of a handful of contracts, the average training time I received was 4 hours (a morning usually) and included in that was the dull and often pointless company induction video or a presentation by the HR department. I was usually on the phones dealing with customers before the end of the day.

And it showed. On the whole I think I coped quite well (but then I would say that), however the fact is customers were aware they were speaking to 'the new guy' - and often took the time to explain things carefully to me.

There are many things wrong with this slap-dash approach. Here are two of them.

Firstly, it is bad for customers. It's hard for the new start to understand what the problem might be (and sometimes to understand the specialist terms customers are using), it's impossible to resolve the Incident first time, and it's often impossible to ask the right questions for 2nd line support. There's a negative effect on quality that ripples right through service delivery.

Secondly, the new start can't help but have a poor initial impression of their new manager and employer. Chances are they'll say 'my boss is an idiot' if asked.

So, long before you recruit anyone you need a training and induction programme for the Helpdesk/Service Desk person you are recruiting.

The form this takes will vary. For instance:

  • A training course on the products you support, if you are supporting a single group of products or services - including sending the new recruit on your customer-focussed training course (even if this is days rather than hours in length).
  • Whatever you do, don't just say 'have a play with Product X' if for whatever reason no training course is available. Instead, set measurable goals that will test learning and understanding.
  • If you are engaged in more general IT support, make sure the new recruit is aware of the services you are supporting and has access to your Service Catalog.
  • Don't expect the employee's call handling and customer handling skills to come complete and perfect. These skills can be honed, as I've mentioned here.

A really good idea is to consider role-plays before any customer exposure. Take 5 or 6 typical cases ranging from easy to difficult, go into another room, and make the calls. This will also help to test how well they can use your ITSM tool.

Most importantly, add training time to the budget allocated for the new employee. It will help you understand what you are losing if staff aren't sticking around for very long - something I'll write about next week. 

Tuesday, 19 Feb 2008

Don't Oversell Availability, warns Peter Warren.

You've maybe never heard of S3, but the chances are you've used a website that uses it either wholly or in part in the last month.

S3 is Simple Storage Service - web content hosting on steroids, and is provided by Amazon. It is used by services like the excellent Twitter, Smugmug and Pownce, and thousands more. Its key selling points are its low cost and high availability - and at the end of last week it suffered a fairly substantial amount of downtime.

As a user myself, I've been surprised by the reaction of the S3 user community who view this like it is the end of the on-line world as we know it. In fact, the service was down for less than 3 hours.

Many users seem to have taken the marketing speak about "ensuring that the data will always be available when you need it" and the reassurances about redundant copies of data on different servers (the much hyped 'cloud') at face value - and have taken 100% uptime as the minimum that they should expect.
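To put the outage in perspective, here is a rough worked calculation. It assumes a 30-day month and a full 3 hours of downtime (the actual figure was somewhat less); the function name is purely illustrative.

```python
def availability_percent(downtime_hours: float, period_hours: float) -> float:
    """Return availability as a percentage of the measurement period."""
    if period_hours <= 0:
        raise ValueError("period_hours must be positive")
    return 100.0 * (1 - downtime_hours / period_hours)


# Assumption: a 30-day month (720 hours) with 3 hours of downtime.
month_hours = 30 * 24
print(f"{availability_percent(3, month_hours):.2f}%")  # prints 99.58%
```

On those assumptions the month still comes out well above 99%, which is a long way from 'the end of the on-line world' but is equally not 100%.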

Personally I always view these claims in the same way as claims like 'this ship is unsinkable' - it always seems companies lack the imagination to understand where the critical weak-point is. In this case, S3 may (according to some rumours) have suffered a good old-fashioned Denial of Service (DoS) Attack - its authentication server got a gazillion requests and could not keep up. The 3 hour downtime was how long it took Amazon to create extra capacity.

My complaint is really about the way Amazon dealt with this, and I think this might be what is at the root of some customer ill-feeling (some customers seem to be downright unreasonable though).

I got a call in the small hours of the morning to say our website was down. Working from home it was clear that some important files hosted on S3 were inaccessible, but there was nothing from their Helpdesk that I could find on the support site to say either that Amazon knew something was up, or if they did know when it would be back.

This meant I had little of value to say to our own customers.

It took a note on the user forum before we got any word from Amazon. Frankly that's not good enough.

Many users are now asking for service status information. I hope they provide this soon.

I guess the moral of the story is: don't oversell availability, practice your customer response for when the inevitable unavailability occurs, and be very open. After all, I have Service Status pages, why don't they?

Peter Warren is a guest blogger. 

Thursday, 14 Feb 2008

I've written previously about some of the changes in ITIL V3 - and in particular Service Requests. My post today is about one of the really nice enhancements in Serio Release 5 to help customers cope better with Service Requests (which can include low-risk, high-volume mini Changes).

The objective was to create something that would help Agents step through a series of steps, offer guidance on what to do, and show others where the SR was in relation to completion - and be simple to set-up and use.

So, we've added something called Checklists. Checklists are really simple, and will be available in both the Serio Helpdesk and Serio Service Desk products.

Think of Checklists as Change Management Light.

The role of the Checklist is to present those working with Service Requests (as well as Incidents, Problems and Changes if you so wish) with:

  • A tick list of tasks (called Jobs) that they have to do in order to resolve the ticket.
  • Help and guidance on how to complete these Jobs.
  • Reminders as to what they've done, and what they need to do next
  • A means whereby the ticket can't be closed until the Checklist is either finished or cancelled.
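As a way of picturing the list above, here is a minimal sketch of a Checklist as a data structure. This is not how Serio implements it: the class, field and method names are all assumptions made for illustration, and only the three status values come from the product description.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class ChecklistStatus(Enum):
    IN_PROGRESS = "In-Progress"
    COMPLETED = "Completed"
    CANCELLED = "Cancelled"


@dataclass
class Job:
    name: str
    guidance: str = ""   # help text on how to complete the Job
    done: bool = False


@dataclass
class Checklist:
    name: str
    jobs: List[Job] = field(default_factory=list)
    cancelled: bool = False

    @property
    def status(self) -> ChecklistStatus:
        if self.cancelled:
            return ChecklistStatus.CANCELLED
        if all(job.done for job in self.jobs):
            return ChecklistStatus.COMPLETED
        return ChecklistStatus.IN_PROGRESS

    def current_job(self) -> Optional[Job]:
        """The next Job still to be done, or None if nothing is outstanding."""
        if self.cancelled:
            return None
        return next((job for job in self.jobs if not job.done), None)

    def can_close_ticket(self) -> bool:
        """A ticket may only close once the Checklist is finished or cancelled."""
        return self.status is not ChecklistStatus.IN_PROGRESS


tea = Checklist("Tea", [Job("Boil kettle"), Job("Add tea bag"), Job("Pour water")])
tea.jobs[0].done = True
print(tea.current_job().name)  # Add tea bag
print(tea.can_close_ticket())  # False
```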

One additional design goal was simplicity - and I think that's certainly been met. Setting-up a new Checklist takes just minutes, and is easy to amend when you need to. I made a Checklist this morning for making tea in about 10 minutes.

Figure 1 shows how my 'tea' Checklist looks when you're using the tool. Notice also that the instructions for each Job in the Checklist can have a rich text description.

Figure 1 - 'Tea' Checklist example

Checklists are started, actioned and completed via Actions - so you can usefully combine them with other things like Reminders, email and so on.

You can also use Checklists within existing Change Plans - something you might want to do if you need to make your existing Change Plan Templates simpler (by using Checklists within each Task).

Also included as part of this enhancement are some new Columns for your Issue Display:

  • Checklist Name
  • Checklist Status (In-Progress, Completed or Cancelled)
  • Current Job
  • Date and time last Job was completed

and we've also added a Checklist search option to queries.

Monday, 11 Feb 2008

I'm asked to write today for a customer who is taking their first steps with Problem Management - in particular, I'm asked to write about practical strategies for Problem resolution. We have other Problem Management resources, such as a white paper and blog posts here, here and here.

So what do I mean by practical strategies? Something that is apparently not obvious to all engineers - I have a problem, how do I resolve it (or more accurately, get to the bottom of it)? I'm going to assume that the Problem Description is accurate and complete, and that you've got access to an accurate CMDB.

1. Use the Internet wisely.

For some Problems (but not all), engineers will be using the Internet. Searching the internet is not as straight forward as it sounds. Keep these tips in mind.

- Use different search engines. Results can vary widely, and some search engines (like Google) place less emphasis on on-page factors (such as titles and content) than they do on links. Others place somewhat more emphasis on on-page content.

So, if you don't find what you are looking for on Google, try another search engine such as Yahoo.

- Know how to perform advanced searching. For example, suppose you are trying to search a vendor website for an error message, but find that there is either no search mechanism on the site, or that the vendor website search mechanism returns poor results. Did you know that you can use internet search engines to search the site? (I almost never use a website's built-in search mechanism).

For instance, to search for the word kpi on a particular website, you can type this into your favourite search engine (example.com here stands in for the site you want to search): site:example.com kpi

and you will get results only from that site with the term kpi. Similarly, to search for a phrase such as "Incident Management" you can use site:example.com "incident management"

The site command works on Google, Yahoo and Live.
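If your team runs these searches often, the query strings can be built with a small helper. This is only a sketch: the function name is made up, and example.com is a placeholder domain rather than any site mentioned in this post.

```python
from urllib.parse import quote_plus


def site_query(site: str, terms: str, exact_phrase: bool = False) -> str:
    """Build a search string restricted to one site using the site: operator."""
    if exact_phrase:
        terms = f'"{terms}"'
    return f"site:{site} {terms}"


# example.com is a placeholder domain.
print(site_query("example.com", "kpi"))
print(site_query("example.com", "incident management", exact_phrase=True))

# URL-encoded form, ready to append to a search engine's query parameter.
print(quote_plus(site_query("example.com", "kpi")))
```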

2. Maintain a list of useful web sites, organised by topic. This is a perfect candidate for a Knowledgebase document.

3. Identify the right expert forums, and participate.

Take the time to find the best expert newsgroups and forums for your particular technical discipline. Most of these are free, although a few do require a small fee for registration. This gives you a chance to discuss issues with others, and from experience more people will take the time to reply to your questions if you are seen to be helping others.

4. Always try to replicate the error.

In my experience this is something that not everyone does, but it's usually worthwhile. For some situations it's not practical, but most of the time it is. Even if the error is not repeatable that in itself is a fact worth knowing and recording.

5. Maintain test systems in advance of actually needing them.

If you do this you'll save a lot of time in trying to replicate errors and you'll be quicker in devising workarounds.

6. Make sure the Problem Management team know about - it's an excellent (searchable) usenet archive.

Tuesday, 05 Feb 2008

This is a continuation of last week's post Making a Start with ITSM Reporting.

I mentioned First Time Fix Rate then as a measure of quality.

Staying with this theme of measures of quality for the Helpdesk or Service Desk, have a look at your telephone statistics. Examine how long it takes the phone to get answered, and what your call abandonment rate is at different times of day. Also check when calls are actually coming in - if there is a gap (say calls start at 8:00am but your service does not start until 9:00am) it might mean you should re-examine your SLA.
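If your phone system can export raw call records, the abandonment rate by hour is simple to work out yourself. The sketch below assumes a made-up record layout of (hour of day, answered or not, seconds waited); your own export will almost certainly look different.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical call record: (hour of day, answered?, seconds until answered or abandoned)
Call = Tuple[int, bool, float]


def abandonment_by_hour(calls: Iterable[Call]) -> Dict[int, float]:
    """Return the fraction of calls abandoned in each hour of the day."""
    total = defaultdict(int)
    abandoned = defaultdict(int)
    for hour, answered, _wait in calls:
        total[hour] += 1
        if not answered:
            abandoned[hour] += 1
    return {hour: abandoned[hour] / total[hour] for hour in sorted(total)}


# Made-up sample data.
calls = [(8, False, 40.0), (8, False, 55.0), (9, True, 12.0), (9, False, 90.0)]
print(abandonment_by_hour(calls))  # {8: 1.0, 9: 0.5}
```

An hour where every call is abandoned, like 8:00am in this sample, is the kind of gap between demand and service hours worth investigating.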

Next take a look at your backlog. By backlog, I mean the number of unresolved or active tickets you have at the start of your reporting interval when compared with the number at the end - this can be looked at for Incidents, Problems or Changes depending on what processes you are actually running.

In Serio, there are lots of ways to get this but probably the easiest is from running some of the Executive Summary reports (for example, Report ES1, which you'll find attached). Aside from getting this data for your current period you can also go back 3 months, and determine if the trend is neutral, rising or falling - and then draw conclusions or take action as appropriate.
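Once you have the backlog figure for each period, classifying the trend is straightforward. This sketch is independent of any Serio report; the function name, the tolerance parameter and the sample numbers are all assumptions.

```python
from typing import Sequence


def backlog_trend(backlog_counts: Sequence[int], tolerance: int = 0) -> str:
    """Classify the trend from the first to the last backlog count."""
    if len(backlog_counts) < 2:
        raise ValueError("need at least two periods to see a trend")
    change = backlog_counts[-1] - backlog_counts[0]
    if change > tolerance:
        return "rising"
    if change < -tolerance:
        return "falling"
    return "neutral"


# Made-up counts of unresolved tickets at the end of each of the last 3 months.
print(backlog_trend([120, 135, 150]))  # rising
```

The tolerance lets you treat small month-to-month wobbles as neutral rather than reacting to every change.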

So far, if you've taken all of this data, you've got a measure of quality of service from your front-facing teams, and a very broad measure of throughput through the system. Now let's look at the 'back-end' - those resolving tickets.

Staying with easily-available statistics, you can look at time-to-fix data (which is mainly focused on Incidents). This kind of data is simply a measure of how long it takes, in working time, to get from an Incident being logged to it being resolved. Again there are lots of ways to access this - in SerioReports, have a look at the SLA Analysis reports - or if you want less detailed data, you'll find it also in the Executive Summary reports I mentioned earlier. For example, you could run Report SLA5 which is a simple results-against-target analysis, or run a time distribution report like SLA12. You can find both reports attached.

The point of these reports is that they measure how much time it takes to resolve an Incident - and allow you to check that what you've agreed with your customers is what is actually being achieved. If you don't have an agreement with your customers (you have no Service Level Agreement of any kind) right now, create some targets for yourself and your team - and then measure your performance against these along with creating a Service Level Management process to go with them.
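A results-against-target figure boils down to a simple percentage. The sketch below is not how the SLA reports calculate it; it assumes you already have time-to-fix values in working hours, and the 8-hour target and sample values are invented.

```python
from typing import Sequence


def percent_within_target(fix_hours: Sequence[float], target_hours: float) -> float:
    """Percentage of Incidents resolved within the target working time."""
    if not fix_hours:
        raise ValueError("no resolved Incidents to measure")
    met = sum(1 for hours in fix_hours if hours <= target_hours)
    return 100.0 * met / len(fix_hours)


# Made-up time-to-fix values in working hours, against an 8-hour target.
print(percent_within_target([1.5, 4.0, 9.0, 7.5], target_hours=8))  # 75.0
```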

Other data that can be revealing is customer satisfaction survey data. You can ask Serio (and many other ITSM tools) to gather this for you as you resolve tickets, and it provides a way to gauge customer perceptions of the service you are providing.

In these posts I've tried quite hard to focus on easily available data, and to write for someone getting started. More mature ITSM environments might include this data but would probably also include Availability (see the Availability white paper), costs of downtime, Problem and Change metrics, and a more detailed SLA analysis.

Thursday, 31 Jan 2008

A colleague here asks me to write about reporting for a customer who is trying to create an IT service management report for the first time, and has little or no Serio experience - and who is not sure what data to use or where to begin.

First of all, I'll list the resources we have here on this website. Probably your first job should be to print and read our Service Desk Metrics White Paper. This white paper discusses different types of data, discusses why we write reports in the first place, and provides a sample reports template you can use.

This subject has come up before on this very blog, in these posts about metrics and KPIs (Key Performance Indicators). These posts might be of use:

Key Performance Indicators for Incident Management

Some Service Level Management Key Performance Indicators

Problem Management KPI Suggestions

Does Your Helpdesk/Service Desk Phone Just Ring Out?

There are others - search the blog for metrics and reporting.

(In case you've ever wondered, the difference between a metric and KPI is this: a metric is just a measure of something, whilst a KPI should be a measure of quality)

It doesn't matter too much if the Categories you've got set-up and are using for Incident logging are a bit of a mess. Clearly this is not ideal and needs to be rectified at some point, but it should not stop you producing a report.

So having said all that, where do you start? You will need to locate and install a copy of SerioReports, as that is where (not surprisingly) most of the ready to run reports are located. Make sure you can connect to your live system with this - you'll find instructions for how to do this in the SerioReports help.

Let's look at some measures of quality we can use.

First Time Fix Rate (FTFR). You'll find this in report AGT14, located under Agent Performance in your Report Explorer. FTFR is one (of many) measures of quality - it tells you how often a Customer who calls with a problem gets an immediate resolution. If your figure is very low (for instance, less than 10%) it might indicate training or skills gaps within your Helpdesk or Service Desk team, issues with morale or motivation, or simply that the problems you deal with are of such a complex nature that FTFR will always be low.

This is where your judgement and skills as a manager will come into play - understanding why things are the way they are, and making recommendations for improvement.
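For anyone who wants to sanity-check the figure by hand, FTFR is just a ratio. This is a rough sketch rather than the calculation behind report AGT14; the function name and the sample counts are assumptions.

```python
def first_time_fix_rate(resolved_first_time: int, total_logged: int) -> float:
    """Percentage of logged tickets resolved at the first contact."""
    if total_logged <= 0:
        raise ValueError("total_logged must be positive")
    return 100.0 * resolved_first_time / total_logged


# Made-up figures: 42 of 600 tickets resolved on the first call.
rate = first_time_fix_rate(42, 600)
print(f"{rate:.1f}%")  # 7.0%
print("worth investigating" if rate < 10 else "no obvious concern")
```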

Whilst you are in that area of SerioReports, have a look at who is resolving tickets by examining report AGT21 or AGT5.

I'll continue this post either tomorrow or early next week.

Tuesday, 29 Jan 2008

It's not often I'm thrilled with the idea of new technology and gadgets. As someone in their mid-forties I'm old enough to remember the Pen PC from the late 90's (sank without trace), Prestel (ditto), LED-display watches (useless and uncool) and a whole bunch of other stuff that was going to be 'mainstream' and 'big'.

So I'm slightly sceptical about new gadgets generally. My experience is that consumers are much more conservative than most PR-companies expect.

However, one thing I've seen recently has certainly caught my attention.

It's called a Readius (there is a youtube clip here) from Polymer Vision, and features a new type of display - one that folds. I use a PDA, but one of the things that irritates me is the size of the screen - I just can't see everything I want to, particularly when using the Internet. The size of the screen is the major thing that affects portability as the screen can't bend or fold - so I'm stuck with a few square inches to squint into.

That is until the Readius. This screen folds out so you can read it - almost like paper. It means that for a smaller device than I carry now, I can have a bigger screen - offering the promise of a usable display that will fit in my suit pocket.

Right now the fold-out display is greyscale (fine for what I want) but features a very low power consumption footprint (battery life on my current HP PDA is not brilliant).

Of course, what would make this fly off the shelves of technology dealers is Internet capability - attaching the screen to a 3G phone to make a truly portable mobile Internet device with a decent, usable screen. Alas this is where the device falls down - it simply (at the moment) does not refresh fast enough to be used in this way, although it is promised for a couple of years' time. As the price for the Readius seems to be in the order of USD800, it makes it a very expensive toy until it can access websites and comes attached to a device with a browser. In the meantime, I'm still interested enough to consider buying.

It will also be interesting to see what the reliability is like - will the folding lead to cracks and expensive warranty claims?

However, improving displays puts the focus on keyboards - or lack of keyboard. They are either like arcade games (Blackberry) or much too large to carry (Pocketop Wireless). Hopefully the people at Polymer Vision will come up with a solution soon to this.

Thursday, 24 Jan 2008

My post today is about something new in ITIL Version 3 – Service Requests. Actually, I say new, but Service Requests (SR) were actually in ITIL V2 (which most of you will be familiar with) but one of the welcome changes in ITIL Release 3 is to make the definition and role much clearer. An earlier V3 post is here.

A Service Request is defined thus:

A Service Request is a request from a user for advice, information, a routine change or access to some IT service.

The most obvious example of a Service Request is someone asking for a password reset - but it could be someone asking for some desktop application to be installed, or asking for login rights to some system or service. Generally, they are typified by relatively modest amounts of effort (by the Service Desk) to complete, and little risk to the business. If there is expenditure involved it's usually modest or all agreed up-front.

In the past, many companies will have handled Service Requests as special types of Incidents, or as Changes – but defining them separately gives us an opportunity to have better reporting, and in some cases to reduce administrative time.

As in all cases there are a few downsides (which I think can be safely navigated with a little planning).

  • There is the possibility of confusion between Incidents and SRs, and Changes and SRs. A little training and definition will hopefully overcome this.
  • Service Requests bring into focus the need to have help with determining which systems different Customers can reasonably request access to (and what they already have access to). You'll be pleased to hear we are adding new functionality to help with this.
  • Higher risk or more costly Changes being handled as SRs for administrative convenience - but again, with sufficient control this can be avoided.

We are changing Serio to meet this new or revised definition – some of the changes are actually quite significant and will be released as part of Serio Version 5. I'll write about these later. 

Monday, 14 Jan 2008

Back in October, I started to look at the next version of Microsoft’s server operating system – Windows Server 2008. In that post I concentrated on two of the new technologies – Server Core and Windows Server Virtualization (since renamed as Hyper-V).

For those who have installed previous versions of Windows Server, Windows Server 2008 setup will be totally new. Windows Vista users will be familiar with some of the concepts, but Windows Server takes things a step further with simplified configuration and role-based administration.

Using a technology known as Windows PE, the new setup model allows multiple builds to be stored in a single image (using the .WIM file format). Because many of these builds will share the same files, single instance storage is used to reduce the volume of disk space required, allowing six operating system versions to fit into one DVD image (with plenty of free space).
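To illustrate the general idea of single instance storage (not the .WIM format itself, whose internals are not covered here), this toy sketch keeps one copy of each distinct file content and lets several builds point at it. The class name, build names and file names are all invented.

```python
import hashlib
from typing import Dict


class SingleInstanceStore:
    """Toy store that keeps one copy of each distinct file content."""

    def __init__(self) -> None:
        self.blobs: Dict[str, bytes] = {}            # content hash -> content
        self.builds: Dict[str, Dict[str, str]] = {}  # build -> {path: content hash}

    def add_file(self, build: str, path: str, content: bytes) -> None:
        digest = hashlib.sha256(content).hexdigest()
        self.blobs.setdefault(digest, content)  # stored once per distinct content
        self.builds.setdefault(build, {})[path] = digest

    def read_file(self, build: str, path: str) -> bytes:
        return self.blobs[self.builds[build][path]]


store = SingleInstanceStore()
store.add_file("Standard", "kernel.bin", b"shared kernel")
store.add_file("Enterprise", "kernel.bin", b"shared kernel")
store.add_file("Enterprise", "cluster.bin", b"clustering extras")
print(len(store.blobs))  # 2 distinct contents stored for 3 file entries
```

The more files the builds have in common, the less extra space each additional build costs.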

The first stage of the setup process is about collecting information. Windows Setup now asks fewer questions and instead of being spread throughout the process (anybody ever left a server installation running and then returned to find it had stopped half way through for input of some networking details?) the information is all gathered at this first stage in the process. After gathering details for the language, time and currency, keyboard, product key (which can be left and entered later), version of Windows to install, license agreement and selection of a disk on which to install the operating system (including options for disk management), Windows Setup is ready to begin the installation. Incidentally, it’s probably worth noting that SATA disk controllers have been problematic when setting up previous versions of Windows. Windows Server 2008 had no issues with the motherboard SATA controller on the Dell server that I used for my research.

After collecting information, Windows Setup moves on to the actual installation. This consists of copying files, expanding files (which took about 10 minutes on my system), installing features, installing updates, two reboots and completing installation. One final reboot brings the system up to the login screen after which Windows is installed. On my server (with a fast processor, but only 512MB of RAM) the whole process took around 20 minutes.

At this point you may be wondering where the computer name, domain name, etc. is entered. Windows Setup initially installs the server into a workgroup (called WORKGROUP) and uses an automatically generated computer name. The Administrator password must be changed at first logon, after which the desktop is prepared and loaded.

Windows Server 2003 included an HTML application called the Configure Your Server Wizard and service pack 1 added the post-setup security updates (PSSU) functionality to allow the application of updates before enabling non-essential services. In Windows Server 2008 this is enhanced with a feature called the Initial Tasks Configuration Wizard. This takes an administrator through the final steps in setup (or initial tasks in configuration):

  1. Provide computer information – configure networking, change the computer name and join a domain.
  2. Update this server – enable Automatic Updates and Windows Error Reporting, download the latest updates.
  3. Customise this server – add roles or features, enable Remote Desktop, configure Windows Firewall (now enabled by default).

Roles and Features are an important change in Windows Server 2008. The enhanced role-based administration model provides a simple approach for an administrator to install Windows components and configure the firewall to allow access in a secure manner. At release candidate 1 (RC1), Windows Server 2008 includes 17 roles (e.g. Active Directory Domain Services, DHCP Server, DNS Server, Web Server, etc.) and 35 features (e.g. failover clustering, .NET Framework 3.0, Telnet Server, Windows PowerShell).

Finally, all of the initial configuration tasks can be saved as HTML for printing, storage, or e-mailing (e.g. to a configuration management system).

Although Windows Server 2008 includes many familiar Microsoft Management Console snap-ins, it includes a new console which is intended to act as a central point of administration – Server Manager. Broken out into Roles, Features, Diagnostics (Event Viewer, Reliability and Performance, and Device Manager), Configuration (Task Scheduler, Windows Firewall with Advanced Security, Services, WMI Control and Local Users and Groups) and Storage (Windows Server Backup and Disk Management), Server Manager provides most of the information that an administrator needs – all in one place.

It’s worth noting that the Initial Tasks Configuration Wizard and Server Manager do not apply for Server Core installations. Server Manager can be used to remotely administer a computer running Server Core, or hardcore administrators can configure the server from the command line.

So that's Windows Server 2008 setup and configuration in a nutshell. Greatly simplified. More secure. Much faster.

Of course, there are options for customising Windows images and pre-defining setup options but these are beyond the scope of this article. Further information can be found elsewhere on the ‘net – I recommend starting with the Microsoft Deployment Getting Started Guide.

Windows Server 2008 will be launched on 27 February 2008. It seems unlikely that it will be available for purchase in stores at that time; however corporate users with volume license agreements should have access to the final code by then. In the meantime, it's worth checking out Microsoft's Windows Server 2008 website and the Windows Server UK User Group.

Tuesday, 08 Jan 2008

This is just a very brief follow-up to Duncan's last post, which mentions Known Errors but does not define them.

A Known Error is an output from Problem Management (or more accurately, your Problem Resolution process). For a definition of a problem, click here.

If you think of a Problem as being something you don't understand, think of a Known Error as something you do understand – even if you don't know how to fix it just yet.

In the case of a software bug, it would be after analysis of source code and algorithms. In the case of an infrastructure problem it is after carefully verifying the conditions necessary for repetition of the Problem and (ideally) identifying the faulty components.

Known Errors typically have two parts. The first of these is the description of the Known Error itself, showing users or product modules and versions affected. The second is the Workaround and/or Change:

  • Workaround – a way to bypass the fault you've previously described that can be used by Customers.
  • Change – to resolve the underlying Problem.

In practice, many support professionals seem to raise Known Errors whenever any error condition or bug is proved as repeatable, rather than waiting for the Problem to be diagnosed (fully understood). Also, many raise a Known Error even if a Workaround is not currently available – simply to help Incident logging staff.

Known Errors are used during the Incident logging and resolution process, as a source of Workarounds for customers, and as an information resource for Customer-facing Incident handling staff. Because of this, we've made it so that searching Known Errors is quick and painless from Incident logging in Serio Release 5.
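The two-part structure described above, and the idea of searching it while logging an Incident, can be sketched like this. It is not how Serio stores or searches Known Errors; the class, field names and sample records are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class KnownError:
    description: str                   # the fault, plus users or versions affected
    workaround: Optional[str] = None   # may not exist yet when the record is raised
    change: Optional[str] = None       # Change that resolves the underlying Problem


def search_known_errors(errors: List[KnownError], text: str) -> List[KnownError]:
    """Return Known Errors whose description contains the text (case-insensitive)."""
    needle = text.lower()
    return [ke for ke in errors if needle in ke.description.lower()]


# Made-up records.
known_errors = [
    KnownError("Print spooler crashes on version 2.1", workaround="Restart the spooler service"),
    KnownError("Login page times out for remote users"),
]

for ke in search_known_errors(known_errors, "spooler"):
    print(ke.description, "->", ke.workaround or "no Workaround yet")
```

Allowing the Workaround to be empty reflects the practice of raising a Known Error before a Workaround is available.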