Costs of Unavailability

Commenter Peter asks if I can expand on the subject of ‘cost per lost production hour’ raised in this post on Availability, which is what I intend to do here.

Let me begin by saying that I don’t consider myself an expert in this, having only had to address it once in my career. What I will do is discuss some of the main factors I believe you should consider – if you can think of others, please add them to the comments below.

To start, let’s recap on what we are talking about and why it is important. The subject is the cost of lost production, or the cost of unavailability. Put simply, if you provide a Key Service X to users, and X then becomes unavailable for whatever reason, what is the cost to the enterprise for each hour that X is down? Helpdesk and Service Desk Managers are interested in this because costs accruing to the business are important and legitimate areas of reporting.

So, how do we go about deciding a cost?

Starting with the obvious, costs will be different for different systems or services. I can’t imagine a scenario where a blanket cost would be applied across all systems.

We then need to agree the costs with our key customers and project sponsors – they may have strong opinions on the costs of downtime. Getting agreement with customers is important because they will have much greater faith in the statistics if they have helped to form them.

Here are some of the factors I think you should consider. The emphasis you put on these factors will depend on the service and your own particular circumstances.

Costs of lost business

Some systems (such as sales order processing systems) are directly linked to the ability of the organisation to undertake profitable business. For systems like this, we can determine how much profit we make per hour, and use this as the cost to the enterprise per hour of unavailability. Seasonal factors may sometimes affect this, but I’d advise avoiding over-complication and taking things ‘in the round’.
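As a rough sketch of the profit-per-hour idea, the calculation might look like the following. The function name, the annual profit figure and the number of trading hours are all my own illustrative assumptions, not figures from any real system; substitute whatever you and your customers agree on.

```python
def lost_business_cost_per_hour(annual_profit, trading_hours_per_year):
    """Average profit per trading hour, used as the hourly cost of downtime.

    Both inputs are assumptions to be agreed with the business: the profit
    attributed to the system, and the hours per year in which it trades.
    """
    if trading_hours_per_year <= 0:
        raise ValueError("trading_hours_per_year must be positive")
    return annual_profit / trading_hours_per_year


# Illustrative figures only: 2,000,000 profit over 50 weeks of 40 trading hours.
hourly_cost = lost_business_cost_per_hour(2_000_000, 50 * 40)
print(f"Lost business cost per hour: {hourly_cost:,.2f}")
```

Using a single annual average deliberately ignores seasonal peaks, in keeping with the advice to avoid over-complication.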

Cost of lost reputation

Sometimes the enterprise can, for a period, hide or limit customer exposure to downtime by reverting to manual procedures. In other cases this is not possible, and downtime leads to customer disdain. For an enterprise that trades on its reputation, this can be disastrous. Therefore, we can sometimes estimate a cost for lost prestige in the event of downtime. As you would expect, this is always going to be a value judgement, and may be a contentious issue.

Cost of penalties

Some organisations face financial penalties in the event of downtime. We should always consider these, and it’s usually quite easy to do so as they are defined as part of a contract.

Cost of lost worker hours

We may have groups of workers who need a system to do their jobs – engineers and architects are two groups that spring to mind with CAD-type systems. For these types of workers and others like them, unavailability is the difference between contributing something meaningful to a project and standing around the water cooler talking about last night’s football. In these cases, take a view on the average number of concurrent users for the system, and then work out the average salary cost per hour of a typical worker. Then increase the salary cost per hour by about a third (33%) – this allows for holiday, office space and other costs associated with staff. Take this figure and compute a cost as follows:

Cost per hour = (Average salary cost per hour + (Average salary cost per hour × 0.33)) × Average concurrent users

If you are using Serio, use the User-based downtime reports – store your average number of concurrent users in the CMDB.
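For anyone who prefers to see the lost worker hours formula as code, here is a minimal sketch. The function and parameter names are mine, and the salary and user figures are made up purely for illustration; the 0.33 uplift is the ‘about a third’ allowance described above.

```python
def lost_worker_cost_per_hour(avg_salary_per_hour, avg_concurrent_users,
                              overhead_rate=0.33):
    """Hourly cost of idle staff while a system is unavailable.

    overhead_rate is the uplift on salary (about a third) that allows for
    holiday, office space and other staff-related costs.
    """
    loaded_cost = avg_salary_per_hour + avg_salary_per_hour * overhead_rate
    return loaded_cost * avg_concurrent_users


# Illustrative figures only: 40 concurrent users at an average of 30.00 per hour.
cost = lost_worker_cost_per_hour(30.0, 40)
print(f"Lost worker cost per hour: {cost:,.2f}")
```

Keeping the uplift as a parameter makes it easy to adjust if your organisation uses a different overhead figure.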

Recovery costs

In some cases, after a period of downtime (particularly an extended period) the enterprise incurs costs arising from recovery, such as paying staff to work overtime to clear backlogs, or being unable to pursue more profitable business in the meantime. Whilst this can be very difficult to determine, again take a rounded view and arrive at some easy-to-use estimates.
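One way you might pull the factors above together for a single service is sketched below. This is only an illustration under my own assumptions: that lost business, reputation, penalties and worker hours can each be expressed as a per-hour figure, and that recovery is a flat estimate per incident. Your contracts and circumstances may well call for a different structure, and all names and numbers here are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DowntimeCosts:
    """Agreed cost estimates for one service (all fields are assumptions)."""
    lost_business_per_hour: float = 0.0
    lost_reputation_per_hour: float = 0.0
    penalties_per_hour: float = 0.0
    lost_worker_hours_per_hour: float = 0.0
    recovery_per_incident: float = 0.0

    def hourly_cost(self):
        """Sum of the per-hour factors."""
        return (self.lost_business_per_hour
                + self.lost_reputation_per_hour
                + self.penalties_per_hour
                + self.lost_worker_hours_per_hour)

    def outage_cost(self, hours_down):
        """Hourly cost for the outage duration plus a flat recovery estimate."""
        return self.hourly_cost() * hours_down + self.recovery_per_incident


# Illustrative figures only, for a hypothetical two-hour outage.
costs = DowntimeCosts(
    lost_business_per_hour=1000.0,
    lost_reputation_per_hour=200.0,
    lost_worker_hours_per_hour=1596.0,
    recovery_per_incident=500.0,
)
print(f"Cost per hour of downtime: {costs.hourly_cost():,.2f}")
print(f"Cost of a 2-hour outage:   {costs.outage_cost(2):,.2f}")
```

Factors that do not apply to a given service simply stay at zero, which fits the point that the emphasis will differ from service to service.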