Serio Blog

Thursday, 30 Nov 2006

Over at the Verso blog there's a post called Service Desk or Help Desk - What's in a name? It may help to answer the question we are sometimes asked about this blog and the fact we oftesometimes use both terms, and other times just use one.

Wednesday, 29 Nov 2006

This topic was suggested by a colleague of mine here at Serio, who asked me to write about how Helpdesk or Service Desk managers can help to give their staff further challenges at work, and/or a sense of career progression. He put it more prosaically : ‘what can you do to stop staff feeling like they are stuck in a rut and leaving?’. His comment was triggered by a conversation with a long-standing customer who had lost one of their brightest and best.

As always, I’ll try and stay with the practical. Like a lot of good things, what I’m about to say may seem somewhat obvious but you’d be surprised at how often it is neglected by what are otherwise very capable managers.

Lots of things can lower employee morale, such as poor pay, permanently angry customers, but I’m going to stick to how you give staff new challenges in what might seem to be an environment with limited promotional opportunities.

It really doesn’t matter if have no ‘promotion’ positions available, because what you can do is to give your staff new functions to perform, and this is where ITIL becomes really useful. Let us take the case of a Helpdesk or Service Desk with 6 staff and a manager, where the manager is responsible for producing monthly reports, analysing service levels achieved and so on.

This is quite typical in my experience – we have a manager and then a group of people who are responsible for dealing with customers day to day.

In this situation, you can (as an example) create a new role of ‘Service Level Manager’, and ask one of the staff to extend their repertoire of skills to include this. In doing so, it is absolutely critical that you spell-out for the person concerned what they have to do and be both specific and practical. You might say:

  • Produce monthly reports on Service Levels achieved (giving a sample report), including a summary and conclusion
  • Produce recommendations for improvements where Service Levels fall below that expected or required (again giving some examples)
  • Use the ITSM tool to monitor ongoing Service Levels day-to-day (again being specific. If you were a Serio user you might say ‘Act on the Response Breach Warnings that are sent’)

In order to make this work, you have to hand-over all (or almost all) responsibility. What I mean by this is that if you are discussing new Service Levels with customers, your Service Level Manager might take the lead (which doesn’t mean that the Service Desk or IT Manager is not involved, just that someone else leads the discussions and leads the process of agreement). Also, the person concerned should have a visible part to play in implementing successfully their own recommendations, and be able to take some credit for service improvements that result.

In other words, there is some scope for initiative. If you just ask someone to compile reports that will not achieve what I am talking about here.

Other benefits come from this approach, the principal of which is a wider appreciation of, and involvement in, IT service management.

Make sure it is clear both to the person concerned, and the rest of the team, that the role is a reward for competence and hard work. This gives them incentives and makes it clear that it is possible to progress within your own company.

Please take a look at the two White Papers attached that may be of interest.

Monday, 27 Nov 2006

This post is next in the series on Writing your own Custom Reports.

The last post was all about introducing the SELECT statement, and showing how you can use it to access data.

There are a few points to draw from the previous post, and they are:

Understand your data. You can’t write any report yourself unless you have a clear idea of what data is available. The devil is all in the detail here, and if you want to create a report yourself then you have to grasp the detail.

For example, understanding that the sv_issue_basics View contains both Incident, Problem and Change data is an important detail. You find that out by reading the comments about the View, but you can also infer it by looking at the columns. Since one of the columns has is type_code, and this is I for Incident, p for Problem and c for Change it is easy to deduce this View has all three types of data, and that this column allows you to filter.

Build your query in stages. Just like I did in the earlier post, you can gradually develop your query in stages.

In this post, I’m going to refine the query further, and show you some of the other things we can do with a SELECT.

We finished with this SQL statement

select issue_no, issue_logging_date_time, issue_priority

from sv_issue_basics

where issue_logging_date_time >= ‘2006-10-01′

and issue_logging_date_time <= '2006-10-31'

and type_code = 'i'

order by issue_no

Note that the first part of the statement says what columns we want to see. There is a shorthand way of saying we want to see every column, using *, as in this example below

select *

from sv_issue_basics

where issue_logging_date_time >= ‘2006-10-01′

and issue_logging_date_time <= '2006-10-31'

and type_code = 'i'

order by issue_no

Grouping and Counting

If you try to write a report yourself, you will almost certainly want to both group by different data, and to count instances of particular groups. A lot of the reports in SerioReports do this. This post will show you how.

The Task

Produce a report for October that groups Incidents by Priority, and counts the number logged for each Priority.

What we have to do is introduce a new clause – the GROUP BY clause. This clause tells the database to ‘fold’ rows that are similar together – in this case, by Priority. Since we wish to have a count of Incidents, we can use the function specially provided in SQL for this, the COUNT function (these are called aggregate functions).

select issue_priority, count(*)

from sv_issue_basics

where issue_logging_date_time >= '2006-10-01'

and issue_logging_date_time <= '2006-10-31'

and type_code = 'i'

group by issue_priority

order by issue_priority

If you wish to GROUP results, here are a few simple rules to help you get it right. 

- If you include a column in the SELECT clause that is not an Aggregate function, you must include it in the GROUP clause. 

- Generally it is a good idea to include GROUPed columns in the ORDER clause. Using these rules, I can extend the query to group by Priority and Problem Area, as follows.

SQL query 2

 

select issue_priority, problem_area, count(*)

from sv_issue_basics

where issue_logging_date_time >= '2006-10-01'

and issue_logging_date_time <= '2006-10-31'

and type_code = 'i'

group by issue_priority, problem_area

order by issue_priority, problem_area

 

 

 

 

Figure 1 - Query being run

 

Friday, 24 Nov 2006

I intend to use the subject of reporting for a series of blog articles, covering how to create some simple reports for your own Helpdesk or Service Desk.

Rather than using a software tool such as Crystal, which I know a lot of you do not have, I will to use tools that come with both Microsoft SQL Server and Oracle to create simple SQL queries, and will illustrate how the Serio reporting Views can help you. If you understand how to use the Views, and a little bit of SQL, you will have all the skills you need to create your own reports in a reporting tool.

I will assume you have never used SQL before.

Jargon Buster

SQL: Short for Structured Query Language. This is a language used to interact with a database system. For the purposes of this tutorial it is the language we will use to specify the data we are interested in.

View: All this means is ‘virtual table’. Your database stores data (such as your Incidents or Configuration Items) in tables, a bit like a spreadsheet. Each table is made up of columns, each of which has a name, and rows which contain data. Views are created for the convenience of those writing reports. If you are still baffled by that do not worry – think of each View as a convenient way for you to generate reports.

Getting Started

You’ll need to get hold of the Serio Developer Toolkit for Crystal. You can get this from Serio if you ask – it is free. Follow the instructions in the Kit to create the Views I will use in this tutorial series. Don’t be put off, as it takes no more than 20 minutes to create the Views (even if you are a novice).

Tools for creating and running queries

Microsoft SQL Server: Use the SQL Query Analyser. You can access this from the SQL Server client tools, installable from the SQL Server CD (make sure you install the service packs available from Microsoft). Ask your DBA for assistance in set-up and installation.

Oracle: SQL*Plus. Yes I know it’s a fairly primitive tool, it is it shipped with all Oracle systems. It will be on the Oracle CD. Ask your DBA for assistance in set-up and installation.

Documentation on the Views is available with the Developer Kit. Look in the ‘Schema.xls’ document, and click on the Views worksheet.

Creating our first query

For the first query, I’m going to use the very convenient View sv_issue_basics. This View contains information about every Incident, Problem and Change ever logged in your Serio system.

The Task

Create a query that lists the reference number, date of logging and priority for each Incident logged between 1st October 2006 and 31st October 2006.

To read data from the database, we need to issue a SELECT statement. Rather than explain SELECT I’m simply going to write a statement, and then refine it.

select issue_no, issue_logging_date_time, issue_priority

from sv_issue_basics

It’s probably not a good idea to run this query just yet. We have not specified that we are only interested in data from October – so this query will return information about each Incident, Problem and Change we’ve ever logged! What we need is to specify a condition, and you do that as follows:

select issue_no, issue_logging_date_time, issue_priority

from sv_issue_basics

where issue_logging_date_time >= '2006-10-01'

and issue_logging_date_time <= '2006-10-31'

SQL query

Figure 1 - Query being executed on SQL Server 

I said that sv_issue_basics contains information about all Incidents, Problems and Changes, but we are not telling the database that we only want Incident data. Looking at the documentation for sv_issue_basics, we can see a column called ‘type_code’ that ‘equals i for an Incident, p for a Problem and C for a Change’. This looks like a perfect way to just deal with Incidents, allowing me to add a clause as follows.

select issue_no, issue_logging_date_time, issue_priority

from sv_issue_basics

where issue_logging_date_time >= '2006-10-01'

and issue_logging_date_time <= '2006-10-31'

and type_code = 'i'

If you run this query, you’ll see that it now returns just Incident data, but you’ll probably find that the data returned is unsorted – because we have not told the database to sort it before returning to us. The following statement applies a sort:

select issue_no, issue_logging_date_time, issue_priority

from sv_issue_basics

where issue_logging_date_time >= '2006-10-01'

and issue_logging_date_time <= '2006-10-31' a

nd type_code = 'i'

order by issue_no

This shows the 4 most important parts (but not the only parts) of a SQL statement: select, from, where and order.

We will continue this later, and welcome feedback on this.

Wednesday, 22 Nov 2006

Commentator Peter asks if I can expand on the subject of ‘cost per lost production hour’ in this post on Availability – which is what I intend to do in this post.

Firstly let me say I don’t consider myself an expert in this, having only had to address this once in my career. What I will do is discuss some of the main factors you I believe should consider – if you can think of others, you should add them to the comments below.

Firstly, let’s recap on what we are talking about and why it is important. The subject is the cost of lost production, or the costs of unavailability. Put simply, if you provide a Key Service X to users, and then X becomes unavailable for whatever reason, what is the cost to the enterprise for each hour that X is down? Helpdesk and Service Desk Managers are interested in this because costs accruing to the business are important and legitimate areas of reporting.

So, how do we go about deciding a cost?

Starting with the obvious, costs will be different for different systems or services. I can’t imagine a scenario where a blanket cost would be applied across all systems.

We then need to agree the costs with our key customers and project sponsors – they may have strong opinions on the costs of downtime. Getting agreement with customers is important because they will have much greater faith in the statistics if they have helped to form them.

Here are some of the factors I think you should consider. The emphasis you put on these factors will depend on the service and your own particular circumstances.

Costs of lost business

Some systems (such as sales order processing systems) are directly linked to the ability of the organisation to undertake profitable business. For systems like this, we can determine how much profit we make per hour, and use this as a cost to the enterprise for unavailability per hour. Sometimes seasonal factors may affect this, but I’d advise avoiding over-complication, and advise that you ‘take things in the round’.

Cost of lost reputation

Sometimes the enterprise can, for a period, hide or limit customer exposure to downtime by reverting to manual procedures. In other cases this is not possible, and downtime leads to customer disdain. For an enterprise that trades on its reputation, this can be disastrous. Therefore, we can sometimes estimate a cost for lost prestige in the event of downtime. As you would expect, this is always going to be a value judgement, and may be a contentious issue.

Cost of penalties

Some organisations face financial penalties in the event of downtime. We should always consider these, and it’s usually quite easy to do so as they are defined as part of a contract.

Cost of lost worker hours

We may have groups of workers who need a system to do their jobs – engineers and architects are two groups that spring to mind with CAD-type systems. For these types of workers and others like them, unavailability is the difference between contributing something meaningful to a project and standing around the water cooler talking about last night’s football. In these cases, take a view on the average number of concurrent users for the system, and then average the salary cost-per-hour of a typical worker. Then increase the salary cost per hour by about 1/3rd – this allows for holiday, office space and other costs associated with staff. Take this figure and compute a cost as follows:

Cost per hour = (Average salary cost per hour + (Average salary cost per hour*.33)) X Average Concurrent Users

If you are using Serio, use the User-based downtime reports – store your average number of concurrent users in the CMDB.

Recovery costs

In some cases, after a period of downtime (particularly an extended period) the enterprise has costs arising from recovery – paying staff to work overtime to clear backlogs, inability to follow more profitable business. Whilst this can be very difficult to determine, again take a rounded view and arrive at some easy-to-use estimates.

Monday, 20 Nov 2006

Commentator Robert asks for a ‘page where all the [availability] posts are joined together’. This post is to provide just that.

The first post is an introduction post: ‘Using Serio to obtain Availability statistics’. In short, this post:

  • Defines Availability
  • Points to the White Paper we have on the subject
  • Asks you to think about what your Key Services are
  • Prompts for how Availability data should be presented

Next comes ‘More on Availability statistics’ (not a very imaginative title I know). This post:

  • Discusses identifying what your target for availability should be
  • Asks you to think about how Key Services will be represented in the CMDB

This was followed by ‘Accessing Availability Statistics’. This post:

  • Describes how to use Service Level Agreements (SLAs) with the CMDB
  • How to log Incidents that will deliver the data we need
  • Describes what ‘ingredients’ are used to produced the final Availability graphs

Next I looked at ‘Availability & the Performance Graphs’. This post:

  • Introduced & named the main graphs we use for accessing downtime statistics
  • Looked at formulae for how downtime and availability
  • Discussed the criteria you need to supply to the Performance Graphs

The final post was ‘Availability Reporting Round-up’. This post:

  • Continued the examination of the Performance Graphs, in particular looking at the ‘User-based’ graphs
  • Discussed (briefly) assessing costs associated with lost production.

Thursday, 16 Nov 2006

This is my final post (for the time being) on the subject of Availability reporting. I’ve posted quite a bit about this recently – the previous post in the series is Availability & the Performance Graphs (this has links to the other posts).

Recall from my previous post I listed the Availability graphs, but left discussing them for this post. The graphs I mentioned were

  • Downtime – Item (Monthly)
  • Downtime – Item (Monthly User Based)
  • Downtime – Item (Weekly)
  • Downtime – Item (Weekly User Based)
  • Downtime – System
  • Downtime – System (User Based)

These reports simply show (on a weekly or monthly basis) the total amount of downtime. As you can see, there are broadly two types: those that are ‘User-based’ and those that are not user-based (I’ll call these reports ‘straight downtime‘ reports).

Starting with the straight downtime reports, these just total-up the amount of downtime over the given period. If you’ve had 4 hours downtime in August, that is what the report will show.

The User-based reports do something different. Like the straight reports, they take the amount of downtime that has occurred in the preceding period. However, it is then multiplied by the number of concurrent users for the Key Service in question (this information is taken from the CMDB).

Example: You have a warehousing system that has 30 concurrent users across three sites. This Key Service in a one-month period experiences 2 hours downtime. The User-based downtime reports would show downtime as 2 x 30 = 60 hours.

If the User-based reports sound strange, then here is the intent. You can use them to assess the costs of downtime, because the reports show the amount of lost production hours. As part of your SLA you might agree the cost of a single ‘Lost Production Hour’ for a given Key Service, and from this use the User-based reports for downtime financial reporting.

Coming to a reasonable value for a lost production hour is beyond the scope of this post, but normally it will include an averaged salary value for the users concerned, and may also include a measure for the fact that profitable enterprise has been also lost during downtime – a double whammy for the organisation.

My personal opinion is that the User-based downtime reports are the most useful – they focus attention on the effect of downtime and unavailability on the organisation. I have to comment though that sometimes there is resistance to using these reports in IT departments, because the numbers generated can be very large indeed. However, that should not preclude their use in a properly managed Service Desk or Helpdesk.

Friday, 10 Nov 2006

This is my penultimate post on the subject of Availability statistics. The earlier post on this was Accessing Availability Statistics.

If you’ve read an understood everything up to this point (then give yourself a gold star) you’ll want to know how you can access Availability data.

This information is available through the SerioClient Chapter called ‘Performance’ (click the link underneath ‘Tools’). If you open a Performance Chapter you’ll see the following graphs that you can use:

  • Availability – Item
  • Availability – System
  • Downtime – Item (Monthly)
  • Downtime – Item (Monthly User Based)
  • Downtime – Item (Weekly)
  • Downtime – Item (Weekly User Based)
  • Downtime – System
  • Downtime – System (User Based)

All of these graphs take the ingredients I’ve mentioned before – the Availability SLA, and the Incidents logged against the Item or System – and compute your downtime data for you. All you need to have is accurate Incident records.

Let’s look at these Graphs more closely.

Availability Item, Availability System

These reports express Availability as a percentage over a monthly date range, going back from the report start date a total of 6 months and in so doing allowing the trend of Availability to be discovered. For each month, and for each Item or System , Serio computes the maximum possible Availability based on your SLA in minutes (MAXh), and then computes the amount of downtime (DOWNh) . It then expresses downtime (or more accurately, Availability) as a percentage using this formulae

PCT = ((MAXh – DOWNh) / MAXh) * 100

for each Item or System. As criteria for the graph, you need to supply two items of information:

Impact – So that the calculation of DOWNh is made solely on the basis of Incidents that are related to system unavailability.

Item Type Category (for the Availability Item graph). This is so you can restrict your graph to particular classes of Item (for instance ‘Enterprise servers’).

This style of graph has an interesting side effect – it can make what would otherwise be considered poor availability look quite good.

I’ll complete this post early next week by covering the Downtime reports, and what ‘User based’ means.

In the meantime, have a peaceful weekend, if that’s your thing. Otherwise have a riotous weekend.

Wednesday, 08 Nov 2006

Are you getting the most out of your Serio Knowledgebase? The Serio Knowledgebase is an excellent resource for assisting your Helpdesk/Service Desk in resolving Incidents. Here are a few tips to help you use it more effectively.

Finding Incidents with the Knowledgebase

The Knowledgebase is a great tool for locating Incidents when you remember what they are about, or roughly when they occurred, but can't seem to find them when searching under the appropriate problem area category or customer.

To find Incidents with the Knowledgebase,

  1. Choose 'Manage Incidents' from the SerioClient menu.
  2. Switch to the 'Knowledgebase Query' tab page.
  3. In the Search Phrase box, enter a word or phrase that you think will identify the Incident you are looking for (e.g. Print Problem). Click on 'Search Now!'.

Narrowing Down your Search

If you get too many results, there are lots of ways that you can narrow down your search.

  1. You can put your search phrase into double quotation marks, e.g. "print problem". This restricts Serio to searching for Incidents which contain the exact phrase print problem in their Description or Action History. Without the double quotation marks, Serio can match Incidents that contain either the word print or the word problem.
  2. You can use the keywords and and not in your search phrase. For example, the query print and problem will only match Incidents that contain both words, while the query print and not problem will match Incidents that contain the word print but not problem.
  3. You can narrow your search with criteria. For example, to restrict to searching for Incidents logged in June 2006:

i. Choose 'Incident/Problem/Change' from the 'Search for' drop-down menu, then select 'Incident' from the 'Equal to' drop-down menu, and click the 'Add' button.

ii. Choose 'Month' from the 'Search for' drop-down menu, then select 'Jun' from the 'Equal to' drop-down menu, and click the 'Add' button.

iii. Choose 'Year' from the 'Search for' drop-down menu, then type 2006 in the 'Equal to' field, and click the 'Add' button.

Having refined your search, click 'Search Now!' again.

Choosing Incidents to Work on

Now you have a list of Incidents that you are interested in, how do you start working on these?

  1. Right click on the list of Knowledgebase articles that Serio has found (left pane).
  2. Choose 'Add Issue to Select List' or 'Add All' to select the Incidents you want to work on.
  3. Click on 'Find Incidents'.

There's another good post over at Verso on the subject of Serio and the Service Catalog. This links into George's posts below about Availability statistics (where the subject of representing Key Services in the CMDB is discussed).

Pages