Serio Blog

Friday, 24 Nov 2006

I intend to use the subject of reporting for a series of blog articles, covering how to create some simple reports for your own Helpdesk or Service Desk.

Rather than using a software tool such as Crystal, which I know a lot of you do not have, I will to use tools that come with both Microsoft SQL Server and Oracle to create simple SQL queries, and will illustrate how the Serio reporting Views can help you. If you understand how to use the Views, and a little bit of SQL, you will have all the skills you need to create your own reports in a reporting tool.

I will assume you have never used SQL before.

Jargon Buster

SQL: Short for Structured Query Language. This is a language used to interact with a database system. For the purposes of this tutorial it is the language we will use to specify the data we are interested in.

View: All this means is ‘virtual table’. Your database stores data (such as your Incidents or Configuration Items) in tables, a bit like a spreadsheet. Each table is made up of columns, each of which has a name, and rows which contain data. Views are created for the convenience of those writing reports. If you are still baffled by that do not worry – think of each View as a convenient way for you to generate reports.

Getting Started

You’ll need to get hold of the Serio Developer Toolkit for Crystal. You can get this from Serio if you ask – it is free. Follow the instructions in the Kit to create the Views I will use in this tutorial series. Don’t be put off, as it takes no more than 20 minutes to create the Views (even if you are a novice).

Tools for creating and running queries

Microsoft SQL Server: Use the SQL Query Analyser. You can access this from the SQL Server client tools, installable from the SQL Server CD (make sure you install the service packs available from Microsoft). Ask your DBA for assistance in set-up and installation.

Oracle: SQL*Plus. Yes I know it’s a fairly primitive tool, it is it shipped with all Oracle systems. It will be on the Oracle CD. Ask your DBA for assistance in set-up and installation.

Documentation on the Views is available with the Developer Kit. Look in the ‘Schema.xls’ document, and click on the Views worksheet.

Creating our first query

For the first query, I’m going to use the very convenient View sv_issue_basics. This View contains information about every Incident, Problem and Change ever logged in your Serio system.

The Task

Create a query that lists the reference number, date of logging and priority for each Incident logged between 1st October 2006 and 31st October 2006.

To read data from the database, we need to issue a SELECT statement. Rather than explain SELECT I’m simply going to write a statement, and then refine it.

select issue_no, issue_logging_date_time, issue_priority

from sv_issue_basics

It’s probably not a good idea to run this query just yet. We have not specified that we are only interested in data from October – so this query will return information about each Incident, Problem and Change we’ve ever logged! What we need is to specify a condition, and you do that as follows:

select issue_no, issue_logging_date_time, issue_priority

from sv_issue_basics

where issue_logging_date_time >= '2006-10-01'

and issue_logging_date_time <= '2006-10-31'

SQL query

Figure 1 - Query being executed on SQL Server 

I said that sv_issue_basics contains information about all Incidents, Problems and Changes, but we are not telling the database that we only want Incident data. Looking at the documentation for sv_issue_basics, we can see a column called ‘type_code’ that ‘equals i for an Incident, p for a Problem and C for a Change’. This looks like a perfect way to just deal with Incidents, allowing me to add a clause as follows.

select issue_no, issue_logging_date_time, issue_priority

from sv_issue_basics

where issue_logging_date_time >= '2006-10-01'

and issue_logging_date_time <= '2006-10-31'

and type_code = 'i'

If you run this query, you’ll see that it now returns just Incident data, but you’ll probably find that the data returned is unsorted – because we have not told the database to sort it before returning to us. The following statement applies a sort:

select issue_no, issue_logging_date_time, issue_priority

from sv_issue_basics

where issue_logging_date_time >= '2006-10-01'

and issue_logging_date_time <= '2006-10-31' a

nd type_code = 'i'

order by issue_no

This shows the 4 most important parts (but not the only parts) of a SQL statement: select, from, where and order.

We will continue this later, and welcome feedback on this.

Wednesday, 22 Nov 2006

Commentator Peter asks if I can expand on the subject of ‘cost per lost production hour’ in this post on Availability – which is what I intend to do in this post.

Firstly let me say I don’t consider myself an expert in this, having only had to address this once in my career. What I will do is discuss some of the main factors you I believe should consider – if you can think of others, you should add them to the comments below.

Firstly, let’s recap on what we are talking about and why it is important. The subject is the cost of lost production, or the costs of unavailability. Put simply, if you provide a Key Service X to users, and then X becomes unavailable for whatever reason, what is the cost to the enterprise for each hour that X is down? Helpdesk and Service Desk Managers are interested in this because costs accruing to the business are important and legitimate areas of reporting.

So, how do we go about deciding a cost?

Starting with the obvious, costs will be different for different systems or services. I can’t imagine a scenario where a blanket cost would be applied across all systems.

We then need to agree the costs with our key customers and project sponsors – they may have strong opinions on the costs of downtime. Getting agreement with customers is important because they will have much greater faith in the statistics if they have helped to form them.

Here are some of the factors I think you should consider. The emphasis you put on these factors will depend on the service and your own particular circumstances.

Costs of lost business

Some systems (such as sales order processing systems) are directly linked to the ability of the organisation to undertake profitable business. For systems like this, we can determine how much profit we make per hour, and use this as a cost to the enterprise for unavailability per hour. Sometimes seasonal factors may affect this, but I’d advise avoiding over-complication, and advise that you ‘take things in the round’.

Cost of lost reputation

Sometimes the enterprise can, for a period, hide or limit customer exposure to downtime by reverting to manual procedures. In other cases this is not possible, and downtime leads to customer disdain. For an enterprise that trades on its reputation, this can be disastrous. Therefore, we can sometimes estimate a cost for lost prestige in the event of downtime. As you would expect, this is always going to be a value judgement, and may be a contentious issue.

Cost of penalties

Some organisations face financial penalties in the event of downtime. We should always consider these, and it’s usually quite easy to do so as they are defined as part of a contract.

Cost of lost worker hours

We may have groups of workers who need a system to do their jobs – engineers and architects are two groups that spring to mind with CAD-type systems. For these types of workers and others like them, unavailability is the difference between contributing something meaningful to a project and standing around the water cooler talking about last night’s football. In these cases, take a view on the average number of concurrent users for the system, and then average the salary cost-per-hour of a typical worker. Then increase the salary cost per hour by about 1/3rd – this allows for holiday, office space and other costs associated with staff. Take this figure and compute a cost as follows:

Cost per hour = (Average salary cost per hour + (Average salary cost per hour*.33)) X Average Concurrent Users

If you are using Serio, use the User-based downtime reports – store your average number of concurrent users in the CMDB.

Recovery costs

In some cases, after a period of downtime (particularly an extended period) the enterprise has costs arising from recovery – paying staff to work overtime to clear backlogs, inability to follow more profitable business. Whilst this can be very difficult to determine, again take a rounded view and arrive at some easy-to-use estimates.

Monday, 20 Nov 2006

Commentator Robert asks for a ‘page where all the [availability] posts are joined together’. This post is to provide just that.

The first post is an introduction post: ‘Using Serio to obtain Availability statistics’. In short, this post:

  • Defines Availability
  • Points to the White Paper we have on the subject
  • Asks you to think about what your Key Services are
  • Prompts for how Availability data should be presented

Next comes ‘More on Availability statistics’ (not a very imaginative title I know). This post:

  • Discusses identifying what your target for availability should be
  • Asks you to think about how Key Services will be represented in the CMDB

This was followed by ‘Accessing Availability Statistics’. This post:

  • Describes how to use Service Level Agreements (SLAs) with the CMDB
  • How to log Incidents that will deliver the data we need
  • Describes what ‘ingredients’ are used to produced the final Availability graphs

Next I looked at ‘Availability & the Performance Graphs’. This post:

  • Introduced & named the main graphs we use for accessing downtime statistics
  • Looked at formulae for how downtime and availability
  • Discussed the criteria you need to supply to the Performance Graphs

The final post was ‘Availability Reporting Round-up’. This post:

  • Continued the examination of the Performance Graphs, in particular looking at the ‘User-based’ graphs
  • Discussed (briefly) assessing costs associated with lost production.

Thursday, 16 Nov 2006

This is my final post (for the time being) on the subject of Availability reporting. I’ve posted quite a bit about this recently – the previous post in the series is Availability & the Performance Graphs (this has links to the other posts).

Recall from my previous post I listed the Availability graphs, but left discussing them for this post. The graphs I mentioned were

  • Downtime – Item (Monthly)
  • Downtime – Item (Monthly User Based)
  • Downtime – Item (Weekly)
  • Downtime – Item (Weekly User Based)
  • Downtime – System
  • Downtime – System (User Based)

These reports simply show (on a weekly or monthly basis) the total amount of downtime. As you can see, there are broadly two types: those that are ‘User-based’ and those that are not user-based (I’ll call these reports ‘straight downtime‘ reports).

Starting with the straight downtime reports, these just total-up the amount of downtime over the given period. If you’ve had 4 hours downtime in August, that is what the report will show.

The User-based reports do something different. Like the straight reports, they take the amount of downtime that has occurred in the preceding period. However, it is then multiplied by the number of concurrent users for the Key Service in question (this information is taken from the CMDB).

Example: You have a warehousing system that has 30 concurrent users across three sites. This Key Service in a one-month period experiences 2 hours downtime. The User-based downtime reports would show downtime as 2 x 30 = 60 hours.

If the User-based reports sound strange, then here is the intent. You can use them to assess the costs of downtime, because the reports show the amount of lost production hours. As part of your SLA you might agree the cost of a single ‘Lost Production Hour’ for a given Key Service, and from this use the User-based reports for downtime financial reporting.

Coming to a reasonable value for a lost production hour is beyond the scope of this post, but normally it will include an averaged salary value for the users concerned, and may also include a measure for the fact that profitable enterprise has been also lost during downtime – a double whammy for the organisation.

My personal opinion is that the User-based downtime reports are the most useful – they focus attention on the effect of downtime and unavailability on the organisation. I have to comment though that sometimes there is resistance to using these reports in IT departments, because the numbers generated can be very large indeed. However, that should not preclude their use in a properly managed Service Desk or Helpdesk.

Friday, 10 Nov 2006

This is my penultimate post on the subject of Availability statistics. The earlier post on this was Accessing Availability Statistics.

If you’ve read an understood everything up to this point (then give yourself a gold star) you’ll want to know how you can access Availability data.

This information is available through the SerioClient Chapter called ‘Performance’ (click the link underneath ‘Tools’). If you open a Performance Chapter you’ll see the following graphs that you can use:

  • Availability – Item
  • Availability – System
  • Downtime – Item (Monthly)
  • Downtime – Item (Monthly User Based)
  • Downtime – Item (Weekly)
  • Downtime – Item (Weekly User Based)
  • Downtime – System
  • Downtime – System (User Based)

All of these graphs take the ingredients I’ve mentioned before – the Availability SLA, and the Incidents logged against the Item or System – and compute your downtime data for you. All you need to have is accurate Incident records.

Let’s look at these Graphs more closely.

Availability Item, Availability System

These reports express Availability as a percentage over a monthly date range, going back from the report start date a total of 6 months and in so doing allowing the trend of Availability to be discovered. For each month, and for each Item or System , Serio computes the maximum possible Availability based on your SLA in minutes (MAXh), and then computes the amount of downtime (DOWNh) . It then expresses downtime (or more accurately, Availability) as a percentage using this formulae

PCT = ((MAXh – DOWNh) / MAXh) * 100

for each Item or System. As criteria for the graph, you need to supply two items of information:

Impact – So that the calculation of DOWNh is made solely on the basis of Incidents that are related to system unavailability.

Item Type Category (for the Availability Item graph). This is so you can restrict your graph to particular classes of Item (for instance ‘Enterprise servers’).

This style of graph has an interesting side effect – it can make what would otherwise be considered poor availability look quite good.

I’ll complete this post early next week by covering the Downtime reports, and what ‘User based’ means.

In the meantime, have a peaceful weekend, if that’s your thing. Otherwise have a riotous weekend.

Wednesday, 08 Nov 2006

Are you getting the most out of your Serio Knowledgebase? The Serio Knowledgebase is an excellent resource for assisting your Helpdesk/Service Desk in resolving Incidents. Here are a few tips to help you use it more effectively.

Finding Incidents with the Knowledgebase

The Knowledgebase is a great tool for locating Incidents when you remember what they are about, or roughly when they occurred, but can't seem to find them when searching under the appropriate problem area category or customer.

To find Incidents with the Knowledgebase,

  1. Choose 'Manage Incidents' from the SerioClient menu.
  2. Switch to the 'Knowledgebase Query' tab page.
  3. In the Search Phrase box, enter a word or phrase that you think will identify the Incident you are looking for (e.g. Print Problem). Click on 'Search Now!'.

Narrowing Down your Search

If you get too many results, there are lots of ways that you can narrow down your search.

  1. You can put your search phrase into double quotation marks, e.g. "print problem". This restricts Serio to searching for Incidents which contain the exact phrase print problem in their Description or Action History. Without the double quotation marks, Serio can match Incidents that contain either the word print or the word problem.
  2. You can use the keywords and and not in your search phrase. For example, the query print and problem will only match Incidents that contain both words, while the query print and not problem will match Incidents that contain the word print but not problem.
  3. You can narrow your search with criteria. For example, to restrict to searching for Incidents logged in June 2006:

i. Choose 'Incident/Problem/Change' from the 'Search for' drop-down menu, then select 'Incident' from the 'Equal to' drop-down menu, and click the 'Add' button.

ii. Choose 'Month' from the 'Search for' drop-down menu, then select 'Jun' from the 'Equal to' drop-down menu, and click the 'Add' button.

iii. Choose 'Year' from the 'Search for' drop-down menu, then type 2006 in the 'Equal to' field, and click the 'Add' button.

Having refined your search, click 'Search Now!' again.

Choosing Incidents to Work on

Now you have a list of Incidents that you are interested in, how do you start working on these?

  1. Right click on the list of Knowledgebase articles that Serio has found (left pane).
  2. Choose 'Add Issue to Select List' or 'Add All' to select the Incidents you want to work on.
  3. Click on 'Find Incidents'.

There's another good post over at Verso on the subject of Serio and the Service Catalog. This links into George's posts below about Availability statistics (where the subject of representing Key Services in the CMDB is discussed).

Tuesday, 07 Nov 2006

Monday, 06 Nov 2006

This post continues the topic of obtaining Availability statistics through Serio. The previous posts were ‘Using Serio to obtain availability statistics’ and ‘More on Availability statistics’.

To recap, I’ve said that:

  • you need to consider how you want your Availability data presented (and I gave 2 examples)
  • you need to define your Key Services
  • create a Service Level Agreement (SLA) for each of your Key Services
  • decide how you wish to represent your Key Services in the Configuration Management Database (CMDB).

Having done all of that, you now need to associate your SLAs directly with the Items (if you are using Items) or Services (if you are using Systems). This enables Serio to understand what the target Availability for the Key Service is question is (9:00 to 17:00 Mon-Fri, or 24-hours for example). If you didn’t know that you can associate SLAs directly with Items and Systems, you can – simply edit the Item or System in question and make the association directly.

We are almost ready to gather some statistics at this point, except for one thing.

Recall that I wrote about you need to be clear on what Unavailability means, or is defined as? Sometimes it is obvious (the Key Service is not functioning at all) but in other cases the service might be partly available. For instance, you might have an email system that can send emails within your organisation, but cannot send or receive them externally – does this constitute Unavailability? If you send a lot of emails to customers, or use emails for receiving customer orders, the answer is likely to be ‘yes’. Whatever the case in your organisation, have a clear definition of Unavailability.

This is important because you need to tell Serio what Incidents record Unavailability, and this is done through Impact. If you don’t have one already, create an Impact called ‘System down’. Use this Impact when record Incidents that result in Unavailability – this is how Serio filters routine Incidents from those that indicate a loss of service.

The other ingredient you need to add to the mix if the Key Service itself, represented in the Incident by either an Item or a System.

If we look at what is going into the Incident mix when you log an Incident, you can start to see how your Availability data is produced:

  • The Key Service on which we want Availability data, represented either by an Item or a System
  • The SLA for the Key Service (which we attach directly)
  • Something to tell us this is an Unavailability Incident (the ‘System Down’ Impact
  • A start date and time (when we logged the Incident) and an end date and time (when we resolve it)

I’ll look in my next post at how you use all of the ingredients above to produce meaningful statistics.

Friday, 03 Nov 2006

Over the the Verso blog there are a couple of very useful posts about an ITSM topic that is often either neglected or misunderstood - the Service Catalog. The posts are What is a Service Catalog and Getting your Service Catalog started.

Serio has an excellent repository for Service Catalog information that allows you to log Incident, Problems and Changes against services from the Catalog, and to use that as the basis for reporting (for example Availability reporting). This repository is referred to as Systems.

You can find out how to create a System by using the Administrator HowTo guide (look under Configuration Management).