Serio Blog

Monday, 15 Jan 2007

This is a continuation of the earlier ‘Incident Life Cycle’ post.

I’ll start by talking about Escalation. This is a term that seems to mean different things for different people. For some, it means the Helpdesk or Service Desk assigning an Incident to a more expert team (or third party supplier) when the nature of the Incident and the skills in our specialist groups dictate that we must do so.

For some, it means adjusting the priority of the Incident (usually upwards).

For others, it means changing the Incident and alerting staff when it becomes likely that the resolution will be late.

Fortunately ITIL provides us with some useful definitions – Functional Escalation, and Hierarchical Escalation.

  • Functional Escalation refers to the process of assigning an Incident from one team to another based on the skills required to resolve the Incident – for example, assigning an issue with a database backup to the DBA team.
  • Hierarchical Escalation refers to a process whereby we take action to avert the resolution of an Incident being unsatisfactory or late.

These two types of Escalation are not mutually exclusive: you may, as part of your Incident Management process, do both.

A number of strategies are used in Functional Escalation. Some of those I’ve encountered being used successfully include

  • An ‘open’ system, whereby any Agent wishing to reassign an Incident assigned to them can do so. This assumes that staff work diligently on Incidents, and will not re-assign Incidents for no other reason than that they want to focus on the projects of most interest to them. Generally ‘open’ approaches like this work best in smaller groups, where expertise and responsibility are very clearly defined.
  • A ‘refer and request’ system, whereby any specialist team that wants an Incident reassigned can reassign it back to the Helpdesk or Service Desk only, with a request for assignment to another team (for instance, when it turns out after diagnosis that a different team needs to become involved). This stops a ‘pass the hot potato’ mentality developing, helps the Service Desk team keep involved, and is one of my preferred approaches.

The ‘open’ and ‘refer and request’ methods of Escalation are by no means the only possible approaches, but they are two that are used very frequently in Incident Management.
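To make the difference between the two approaches concrete, here is a minimal sketch in Python. It is only an illustration and not how Serio implements reassignment: the `Incident` class, the `reassign` function, the policy names and the team names are all made up for this example.

```python
from dataclasses import dataclass, field

SERVICE_DESK = "Service Desk"  # hypothetical team name


@dataclass
class Incident:
    summary: str
    assigned_team: str = SERVICE_DESK
    history: list = field(default_factory=list)  # (from_team, to_team) pairs


def reassign(incident, to_team, policy):
    """Reassign an Incident under an 'open' or 'refer_and_request' policy."""
    if policy not in ("open", "refer_and_request"):
        raise ValueError(f"unknown policy: {policy}")
    from_team = incident.assigned_team
    # Under 'refer and request' a specialist team may only hand the
    # Incident back to the Service Desk, which then chooses the next team.
    if policy == "refer_and_request" and SERVICE_DESK not in (from_team, to_team):
        raise PermissionError(
            f"{from_team} must refer the Incident back to the {SERVICE_DESK}"
        )
    incident.history.append((from_team, to_team))
    incident.assigned_team = to_team
```

Under the ‘open’ policy the DBA team could pass an Incident straight to the network team. Under ‘refer and request’ the same call raises an error, and the Incident has to travel via the Service Desk, which is what keeps the Service Desk involved.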

Please note that both of the approaches I’ve discussed above are supported by Serio, as are other variations.

I’ll continue this thread in future posts this week, and also discuss Hierarchical Escalation.

Wednesday, 10 Jan 2007

This is a follow-on post from Introducing Incident Management.

This blog post is going to talk about what ITIL calls ‘the Incident Life Cycle’. If it sounds complicated, it isn’t – and it’s probably something you’ll recognise from your own Incident handling work.

Incident Life Cycle

Incident logging. This is simply where we have reported to us (or detect through automated tools) that an Incident has occurred.

Classification. We classify the Incident, and offer initial support to the customer. At this stage we are looking to get a handle on the business impact of the Incident, assign a priority and seriousness, and classify the type of Incident.

Investigation, analysis and diagnosis. This is the ‘meat’ of the process, where we try to understand how we can restore service to users. It’s important to note here that resolution of the Incident should not be the only output we consider – workarounds are useful to users in some cases (for instance, where we expect the resolution of the Incident to take some time).

Resolution. Taking our analysis we resolve the Incident and restore services to users. Note that this may have entailed raising a Change and/or Problem.

Incident Closure. We close the Incident, allocating Cause data and contract codes as appropriate.

What I want to talk about now is something referred to by ITIL as the ‘Workflow Position’ – this is something that confuses some people, but again is really quite straightforward.

Refer to the life cycle above, and consider this. We log an Incident, and at some time in the future we resolve it. In the middle we ‘do stuff’. For example, you might

  • Assign the Incident to a specialist team for assessment
  • Put the Incident on hold, because the Customer has gone on a 3 month holiday
  • Escalate the Incident to a Team Leader

The Workflow Position is simply a signpost about where we are with an Incident. In the ‘Service Support’ book (published by TSO) the following examples are listed:

New, Accepted, Scheduled, Assigned to Specialist, Work in Progress, On Hold, Resolved, Closed

Please note that ITIL is not prescriptive, so these are listed as examples to communicate meaning and foster understanding. It doesn’t mean that is what you must have in order to be ‘ITIL Compliant’ – you are expected, as someone who works in ITSM, to fit this to your particular organisation (and that means extending these with status values that make sense to you).
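As a rough sketch of what ‘extending these’ could look like, the Python below lists the example positions from the book and adds two local ones. The two extra values, the `WorkflowPosition` name and the `is_open` helper are my own illustrations, not part of ITIL or of Serio.

```python
from enum import Enum


class WorkflowPosition(Enum):
    # Examples listed in the 'Service Support' book
    NEW = "New"
    ACCEPTED = "Accepted"
    SCHEDULED = "Scheduled"
    ASSIGNED_TO_SPECIALIST = "Assigned to Specialist"
    WORK_IN_PROGRESS = "Work in Progress"
    ON_HOLD = "On Hold"
    RESOLVED = "Resolved"
    CLOSED = "Closed"
    # Hypothetical local extensions for one particular organisation
    WAITING_FOR_CUSTOMER = "Waiting for customer response"
    WITH_EXTERNAL_SUPPLIER = "With external suppliers"


def is_open(position):
    """An Incident is still 'open' until it is Resolved or Closed."""
    return position not in (WorkflowPosition.RESOLVED, WorkflowPosition.CLOSED)
```

The point of keeping the list in one place is that every Incident carries exactly one of these signposts at any time, which is what makes the status reports discussed next possible.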

Workflow Position is useful in Incident Management, and here’s why. It allows you to easily create status reports that tell you where you are with the unresolved Incidents you are currently handling. For example, you might run a status report that says:

  • Waiting for customer response: 12
  • On hold: 6
  • In progress: 30
  • With external suppliers: 10
  • Awaiting assessment: 12

This is more useful than simply saying ‘70 Incidents open’ as a summary of our current status.
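If your tool can export the Workflow Position of each unresolved Incident, a report like this is just a count per position. Here is a minimal Python sketch; the `open_incidents` list is invented sample data chosen to match the figures in the example report, and `status_report` is a made-up helper name.

```python
from collections import Counter


def status_report(positions):
    """Count unresolved Incidents per Workflow Position."""
    return Counter(positions)


# Invented sample data: one entry per unresolved Incident
open_incidents = (
    ["Waiting for customer response"] * 12
    + ["On hold"] * 6
    + ["In progress"] * 30
    + ["With external suppliers"] * 10
    + ["Awaiting assessment"] * 12
)

report = status_report(open_incidents)
total = sum(report.values())

for position, count in report.items():
    print(f"{position} {count}")
print(f"Total open {total}")
```

The total is still there if you want it, but the breakdown tells you where the work is actually sitting.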

I’ll continue this thread about Incident Management in coming posts.


Monday, 08 Jan 2007

A Happy New Year to all our blog readers!

This blog may be the first in a series on the subject of Incident Management – a fundamental part of ITSM. I’m going to start with the basics, and then take it from there.

The first thing I need is a definition of what an Incident is. ‘Best practice for Service Support’ (pub TSO) very kindly gives us a definition as follows:

any event which is not part of the standard operation of a service and which causes, or may cause, an interruption to, or reduction in, the quality of a service.

Incidents, therefore, cover a multitude of faults, errors and unexpected events users might experience. An Incident can also be taken as a simple user query like ‘how do I do a mail merge?’ or a password reset request.

Having defined Incidents, it’s worth stating the goal of Incident Management: to restore services as quickly as possible to users, thereby reducing the impact of Incidents upon the organisation. This is not to say that is all we do: there are quite a lot more tasks involved, but this is the ‘output’ or primary goal of the Incident Management process (remember that a general goal of IT Service Management is to stop Incidents happening in the first place).

The role of the Helpdesk/Service Desk in Incident Management is key – even if specialist teams (such as network operations or database administrators) happen to be assigned to the Incident. This is because the Helpdesk/Service Desk should ‘own’ the total pool of active Incidents, taking action where required to ensure a timely resolution or workaround for the customer. This is why Serio allows you to have both an Assigned Team and Agent, and an Owning Team and Agent for an Incident – so that the actual ownership of this ‘pool’ of unresolved tickets can be split between members of the Helpdesk or Service Desk.

To illustrate this with an example, you might have a Service Desk agent called John who logs an Incident for a database error. John might have to assign this to the specialist Database team for resolution, but still maintain some ownership (distinct from assignment) for the Incident in question. What John actually does with the Incidents he owns will depend upon his company’s own procedures, but he might:

  • Intervene and review the Incident at some point before a resolution time (SLA) breach
  • Manage communication with the customer
  • Be involved with the further escalation of the Incident
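The split between ownership and assignment, and the first of John’s possible actions, can be sketched in a few lines of Python. This is an illustration only: the field names, the one-hour warning window and the `needs_owner_review` helper are assumptions of mine, not Serio’s data model.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Incident:
    summary: str
    owning_team: str            # who is responsible for seeing it through
    owning_agent: str
    assigned_team: str          # who is working on it right now
    assigned_agent: Optional[str]
    resolve_by: datetime        # SLA resolution target


def needs_owner_review(incident, now, warning=timedelta(hours=1)):
    """True once the SLA target is within `warning` of `now`, or has passed."""
    return now >= incident.resolve_by - warning


incident = Incident(
    summary="Database error on order entry",
    owning_team="Service Desk",
    owning_agent="John",
    assigned_team="Database",
    assigned_agent=None,
    resolve_by=datetime(2007, 1, 8, 17, 0),
)
```

Reassigning the Incident changes only the assigned fields. John stays the owner, so a check like `needs_owner_review` can always find someone on the Service Desk to prompt before the target time is missed.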

I’ll continue to address this subject in later posts.

Friday, 05 Jan 2007

This is a continuation of the post from a few days ago SNMP for Beginners.

At the end of the post I posed a question about interoperability. How do management applications know the data that network devices can provide, or what their capabilities are? This is where the MIB comes in. Think of a MIB as a definition of what the device can provide, or a menu of its capabilities. If you give the MIB to the management application, it has knowledge of what the device can provide.

MIB stands for Management Information Base. MIB files are human readable – sort of. They use a programming-like language to declare the data that the device can provide. The job of management applications, like our own Command Center tool, is to read these MIBs and use them to access the network device in question.
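To show the ‘menu’ idea, here is a toy Python stand-in for what a management application holds once it has read a MIB. It is not a MIB parser and says nothing about how Command Center works internally. The identifiers and names are a handful of objects from the standard MIB-II definitions; the dictionary layout and the `describe` function are my own, and only three of the `ifOperStatus` values are listed.

```python
# A tiny, hand-written stand-in for parsed MIB definitions
MIB = {
    "1.3.6.1.2.1.1.1": {"name": "sysDescr"},
    "1.3.6.1.2.1.1.3": {"name": "sysUpTime"},
    "1.3.6.1.2.1.1.5": {"name": "sysName"},
    "1.3.6.1.2.1.2.2.1.8": {
        "name": "ifOperStatus",
        "values": {1: "up", 2: "down", 3: "testing"},
    },
}


def describe(oid, raw_value):
    """Translate a raw (identifier, value) pair using the longest matching entry."""
    parts = oid.split(".")
    for end in range(len(parts), 0, -1):
        entry = MIB.get(".".join(parts[:end]))
        if entry:
            instance = ".".join(parts[end:])
            name = f"{entry['name']}.{instance}" if instance else entry["name"]
            # Map enumerated numbers to their labels where the MIB defines them
            value = entry.get("values", {}).get(raw_value, raw_value)
            return name, value
    return oid, raw_value  # nothing in our MIBs describes this identifier
```

Without the MIB, the application sees only a string of numbers and the value 2. With it, the same data reads as ‘ifOperStatus for interface 3 is down’.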

If you wish to see a MIB, please see the attached file.

This poses a number of questions, like ‘where are my MIBs then?’. Generally they are distributed with the product in question (for instance, on the support CD) or are available from the vendor’s web site. However, your device might not have its own specific MIBs; it might simply use standard MIBs – of which there are many. Whatever your device or situation, your vendor documentation is the best place to start looking for MIBs.

Later in the month I hope to follow-up this post with information about some of the standard MIBs, and explain how they can be used.

Tuesday, 02 Jan 2007

This post is about a technology that Serio uses a lot – SNMP. I will write about SNMP in this blog entry and assume that you have no idea what it is.

Those of us here at Serio that write in the blog have some useful guidelines, the first of which is this: ‘define what it is you are talking about in your own words’. So the first thing is to try to define what SNMP is. SNMP: Simple Network Management Protocol. It’s a way for different devices on a network infrastructure to exchange information, typically between a Management application and a Network Device.

At its simplest level, think of it this way. Imagine you have a device, say a hub, that is down the corridor and handles a lot of important traffic. On the front of this device is a status LED (flashing light!) that tells you that the hub is OK (green LED) or has a problem (red LED). Now imagine you have a Network Management guy called Duncan, who is interested in this status value. Every hour Duncan gets out of his chair and walks down the corridor to look at the LED.

Pretty soon Duncan gets tired of this, and wishes he could ‘read’ the status value from his computer, or better still have some kind of application (a management application) that can read the LED value for him, and let him know when it goes red.

This is where SNMP comes into play. It’s basically a way for the LED value to be transmitted over the network, and defines a common data structure whereby the management application can request the value of the LED, the hub can send it, and the management application can understand what is being sent.

In other words, SNMP is generally concerned with sending small pieces of information like status values and small strings. It isn’t something you would use for regular traffic, like audio or file data. In fact, most network devices impose quite strict limits on the size of SNMP requests and responses – typically less than 2K bytes.
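Duncan’s wish can be sketched in a few lines of Python. Please note this is a simulation only: nothing here touches the network or uses a real SNMP library, and the `HubAgent` class, the `ledStatus` name and the `poll` function are all invented to show the request-and-response idea.

```python
class HubAgent:
    """Stands in for the software on the hub that answers requests."""

    def __init__(self):
        self.values = {"ledStatus": "green"}

    def get(self, name):
        # In real life this would be a small request and response on the network
        return self.values[name]


def poll(agent, name, last_seen):
    """One polling cycle: ask for the value, alert if it has just gone red."""
    current = agent.get(name)
    alert = current == "red" and last_seen != "red"
    return current, alert


hub = HubAgent()
status, alert = poll(hub, "ledStatus", None)    # hub is fine, no alert
hub.values["ledStatus"] = "red"                 # something goes wrong
status, alert = poll(hub, "ledStatus", status)  # now Duncan gets told
```

The management application does the walking down the corridor for him, and only bothers Duncan when the value changes to red.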

I’ll post more on this subject later this week. In the meantime, consider this problem: how does a management application know what status information a network device can provide, or what the values all mean?

From everyone at Serio, to all of our customers, blog readers and RSS subscribers, wishing you the very best for 2007.

Jackie, blog admin

Friday, 29 Dec 2006

This is a follow-on post on the subject of CTI in Serio (whilst the earlier CTI posts look at CTI more generally).

Serio CTI depends upon the capabilities of your telephony system (the ‘switch’) completely. It relies on your switch for data about incoming calls – not just Caller Line Identity (CLI) but for information about when events (like incoming call or hang-up) are happening, and getting this information in good time (remember, caller waiting!). Serio CTI relies on your switch for making calls, and for data on the progress of those calls.

So how does all this work? Well it’s all done with something called TAPI. TAPI is short for Telephony Application Programming Interface, an API (a set of function libraries and tools) for connecting a PC running Windows to telephone services. TAPI was introduced in 1993 as the result of joint development by Microsoft and Intel. TAPI supports connections by individual computers as well as LAN connections serving many computers.

TAPI functionality on your switch is what you need in order to work with Serio. Serio supports TAPI versions 2.1 and 3.0. In particular, you need a Telephony Service Provider (TSP). Telephony hardware manufacturers typically provide the TSPs - Serio do not provide these. With some telephony devices there are one or more additional software layers between the TSP and the hardware's native functionality. These are usually referred to as drivers and are installed as part of the hardware/software installation process.

One of the guidelines I have for this blog is this: avoid the use of jargon, acronyms, buzz words and anything else that detracts from the objective of this blog: to help people in practical ways with IT Service Management. However, re-reading what I’ve written here I’ve introduced TAPI, API and TSP and you may be thinking ‘that sounds tough’. Generally, you are looking to your telephony hardware provider to give you these tools and help you set them up – hopefully their packages are quick to install and configure (frankly though, this is not always the case).

It’s not just software you need. With CTI, I think you only get the most when your Agents (those dealing with customers) have the right hardware – and that means headsets that work with the TSP, so that if they make an outbound call, Serio can simply connect them to the speaker at the right moment. Similarly for inbound calls we can simply connect without the Agent having to handle the receiver. I use a headset and would recommend them strongly for anyone using a computer whilst dealing with customers.

More info: see the CTI Roadmap in the HowTo Guide, part of the documentation distributed with the product.

Wednesday, 27 Dec 2006

This is a continuation of my last blog Telephony in IT Service Management.

Whilst my last blog entry looked at statistics, this blog entry will look at how Computer-Telephony Integration (CTI) can be used to make the job of the Helpdesk or Service Desk easier by discussing some of the common applications.

I’ll start with computer dialling. This is where you have a button on your ITSM tool that says ‘Dial’. You click it and the ITSM tool takes the number from the customer database and dials the customer for you, connecting you on answer. You may be thinking ‘so what?’ but I’ll say this – it is a really good time saver, and something you can very quickly get used to.

Another application is commonly referred to as ‘screen popping’. Whilst computer dialling is concerned with outbound traffic, screen popping addresses inbound traffic. What happens is this: when a customer calls you, the ITSM tool detects who is calling and then ‘pops’ a screen that tells you information about the caller, such as their name, which company they work for, and ideally shows you their recent Incident history – so they can be greeted with ‘good morning Mr Smith – how can I help you?’.

Screen popping works using a piece of data called CLI (Caller Line Identity). This is where someone calling a number makes their number visible. If you have used a mobile phone and observed the caller’s number being shown on the LCD display of your phone, then you’ve seen CLI in action.

What typically happens is this: your telephone system detects the incoming call, obtains the CLI data, and then passes this to your ITSM tool which scans its own database and identifies the caller – and this allows the ‘screen pop’ to take place with the right data.
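The lookup step can be sketched as follows in Python. This is a minimal illustration under my own assumptions, not Serio’s implementation: the customer record, the phone number and the function names are all invented, and a real tool would query its customer database rather than a dictionary.

```python
import re


def normalise(number):
    """Keep digits only, so differently formatted numbers still match."""
    return re.sub(r"\D", "", number)


# Invented customer data, keyed by normalised phone number
CUSTOMERS = {
    normalise("01632 960123"): {
        "name": "Mr Smith",
        "company": "Example Ltd",
        "recent_incidents": ["Printer offline"],
    },
}


def screen_pop(cli):
    """Return the caller's record, or None if the caller cannot be identified."""
    if not cli:
        return None  # number withheld, so there is nothing to look up
    return CUSTOMERS.get(normalise(cli))
```

When the lookup returns nothing, the Agent simply gets a blank screen and has to ask who is calling, which is the situation discussed next.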

If you deal with internal customers only then obtaining CLI information is usually unproblematic. If you deal with external customers, then CLI might not be available to you – some companies ‘hide’ their number when making outbound calls (though this unwelcome practice seems to be less common today than a few years ago). Another problem might be a large company applying the same CLI data to all outbound lines from their company, making it impossible for you to tell the actual individual calling.

If you are a Serio customer, you might be asking what Serio can do with CTI. The answer is that both outbound calling and screen popping are supported, but dependent upon your phone system having the right hardware and software capabilities. I’ll post later about what you need for CTI in terms of system requirements.

Friday, 22 Dec 2006

On behalf of everyone at Serio allow me to wish all blog readers and Serio customers a very happy and peaceful Christmas and New Year.
We are closed on 25th, 26th and Jan 1st. If you need help or support on these days simply email support __at__ and someone will pick it up. Make sure you put URGENT in the title somewhere.

Jackie, Blog admin

Telephony statistics and telephony integration are often overlooked in the IT service management mix. In this blog, I’m going to discuss a few aspects of these and how they can enable you to offer a better service for customers.

Firstly telephony statistics. Does your management reporting look at this kind of data? There are a number of key statistics you might want to look at, as follows:

  • Number of rings before pickup
  • Call abandonment rate
  • Distribution of incoming calls during the day

Number of rings before pickup. This is how long it takes to get an answer when someone calls, and it needs to be handled with a little care. An average number of rings figure is good to have, but you need to take into account ‘spikes’ that might be short-lived but point to unacceptable levels of service. For this kind of analysis it is sometimes useful to express this figure as a simple X-Y plot over time, or with statistical measures such as standard deviation or max/min measures. You’ll typically get this from your ACD system.
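Here is a small Python sketch of why the average on its own can mislead. The ring counts are invented sample data, and `ring_summary` is a made-up helper; your ACD system will have its own export format.

```python
import statistics


def ring_summary(rings):
    """Summarise rings-before-pickup with more than just the average."""
    return {
        "mean": statistics.mean(rings),
        "stdev": statistics.stdev(rings),  # sample standard deviation
        "min": min(rings),
        "max": max(rings),
    }


# Invented sample: ten calls, one of which rang for a long time
rings = [2, 3, 2, 4, 2, 3, 12, 2, 3, 2]
summary = ring_summary(rings)
print(summary)
```

The mean for this sample is 3.5 rings, which sounds acceptable, but the maximum of 12 shows that at least one caller waited far longer. That is the kind of spike the average hides.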

Call abandonment rate. This is the number of times a caller gives up and puts down the phone. Apart from an actual count, it’s good to have this figure represented as a time-based graph so you can see if there are problems related to a particular time of day or shift pattern. Again this will come from your ACD system.
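As a sketch of the time-based view, the Python below works out the abandonment rate for each hour of the day. The call records are invented, and I am assuming each record is simply an hour and a flag saying whether the call was answered.

```python
from collections import defaultdict


def abandonment_by_hour(calls):
    """calls: (hour, answered) pairs. Returns {hour: fraction abandoned}."""
    totals = defaultdict(int)
    abandoned = defaultdict(int)
    for hour, answered in calls:
        totals[hour] += 1
        if not answered:
            abandoned[hour] += 1
    return {hour: abandoned[hour] / totals[hour] for hour in sorted(totals)}


# Invented sample data
calls = [
    (9, True), (9, False), (9, True), (9, True),
    (14, False), (14, True),
]
rates = abandonment_by_hour(calls)
```

Plotting these figures against the hour gives the graph described above, and makes it easy to see whether a particular time of day or shift is where callers give up.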

Distribution of incoming calls. This is a useful figure to have, as it tells you when your Helpdesk/Service Desk faces its highest volumes. In many cases you’ll see a smoothed day-by-day graph, a ‘twin bell’ shape, with a surge in the early to mid morning period, and again in the early to mid afternoon period. You often have two sources of information for this data – you can use your ITSM tool (Serio users see report IL14 ‘Daily Activity’) or your ACD system. My preference is the ACD system, as this captures all calls, even those made to discuss last night’s football results.

Telephony doesn’t just provide us with management data, useful though that is. You can also integrate your phone switch with your ITSM tool – something Serio has supported for some time. It’s called CTI (Computer Telephony Integration) and I’ll post later about the kind of things CTI can do for you.

In the meantime, have a happy Christmas and pie-filled New Year!