Availability Reporting Round-Up

This is my final post (for the time being) on the subject of Availability reporting. I’ve posted quite a bit about this recently – the previous post in the series is Availability & the Performance Graphs (this has links to the other posts).

Recall that in my previous post I listed the Availability graphs but left discussing them for this post. The graphs I mentioned were:

  • Downtime – Item (Monthly)
  • Downtime – Item (Monthly User Based)
  • Downtime – Item (Weekly)
  • Downtime – Item (Weekly User Based)
  • Downtime – System
  • Downtime – System (User Based)

These reports simply show the total amount of downtime on a weekly or monthly basis. As you can see, there are broadly two types: those that are ‘User-based’ and those that are not (I’ll call the latter ‘straight downtime’ reports).

Starting with the straight downtime reports: these just total up the amount of downtime over the given period. If you’ve had 4 hours of downtime in August, that is what the report will show.
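
As a rough illustration of what a straight monthly total amounts to, here is a minimal Python sketch. The function name and the list of outage start/end times are hypothetical; they are not taken from any particular Service Desk product, and the sketch assumes each outage is counted against the month in which it starts.

```python
from collections import defaultdict
from datetime import datetime


def monthly_downtime_hours(outages):
    """Sum downtime in hours per (year, month).

    `outages` is a hypothetical list of (start, end) datetime pairs.
    Assumption: an outage is counted in the month it starts in and is
    not split if it runs across a month boundary.
    """
    totals = defaultdict(float)
    for start, end in outages:
        totals[(start.year, start.month)] += (end - start).total_seconds() / 3600
    return dict(totals)


# Two made-up outages in August: 1.5 hours and 2.5 hours.
outages = [
    (datetime(2023, 8, 3, 9, 0), datetime(2023, 8, 3, 10, 30)),
    (datetime(2023, 8, 17, 14, 0), datetime(2023, 8, 17, 16, 30)),
]
print(monthly_downtime_hours(outages))  # {(2023, 8): 4.0}
```

A weekly report would work the same way, grouping by week rather than by month.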

The User-based reports do something different. Like the straight reports, they take the amount of downtime that has occurred in the preceding period. However, that figure is then multiplied by the number of concurrent users of the Key Service in question (this information is taken from the CMDB).

Example: You have a warehousing system that has 30 concurrent users across three sites. In a one-month period this Key Service experiences 2 hours of downtime. The User-based downtime reports would show downtime as 2 x 30 = 60 hours.
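
The arithmetic in the example can be sketched in a few lines of Python. The function name is my own for illustration; in practice the concurrent user count would come from the CMDB rather than being passed in by hand.

```python
def user_based_downtime(downtime_hours, concurrent_users):
    """Return user-based downtime: downtime hours multiplied by concurrent users."""
    if downtime_hours < 0 or concurrent_users < 0:
        raise ValueError("downtime and user count must be non-negative")
    return downtime_hours * concurrent_users


# Warehousing example: 2 hours of downtime, 30 concurrent users.
print(user_based_downtime(2, 30))  # 60
```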

If the User-based reports sound strange, then here is the intent. You can use them to assess the costs of downtime, because the reports show the amount of lost production hours. As part of your SLA you might agree the cost of a single ‘Lost Production Hour’ for a given Key Service, and from this use the User-based reports for downtime financial reporting.

Coming to a reasonable value for a lost production hour is beyond the scope of this post, but normally it will include an averaged salary value for the users concerned, and may also include a measure for the fact that profitable enterprise has also been lost during downtime, which is a double whammy for the organisation.
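
To show how a figure for a lost production hour might feed into financial reporting, here is a small Python sketch. Everything in it is an assumption for illustration: the function names are hypothetical, the rate is modelled as a simple sum of an averaged hourly salary and an optional lost-revenue component, and the monetary figures are invented. Your SLA may well define the value differently.

```python
def lost_production_hour_rate(avg_hourly_salary, lost_revenue_per_user_hour=0.0):
    """Assumed model: averaged salary plus an optional lost-revenue component."""
    return avg_hourly_salary + lost_revenue_per_user_hour


def downtime_cost(lost_production_hours, rate_per_hour):
    """Cost of downtime: lost production hours multiplied by the agreed rate."""
    return lost_production_hours * rate_per_hour


# Invented figures: 25.0 per hour averaged salary, 15.0 per hour lost revenue.
rate = lost_production_hour_rate(25.0, 15.0)
print(rate)                     # 40.0
print(downtime_cost(60, rate))  # 2400.0
```

With 60 lost production hours, as in the warehousing example, this invented rate would put the cost of that month's downtime at 2400.0 in whatever currency the SLA uses.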

My personal opinion is that the User-based downtime reports are the most useful, because they focus attention on the effect of downtime and unavailability on the organisation. I should add, though, that IT departments sometimes resist using these reports, because the numbers generated can be very large indeed. However, that should not preclude their use in a properly managed Service Desk or Helpdesk.