Serio Blog

Monday, 13 Aug 2007

This is another in my series of Change Management posts. The last one is here.

If you are trying to get your Change Management (CM) process ‘kick started’, you’ll probably be looking for guidance on where to begin.

As always, if you are introducing a new ITSM process you’ll need a process owner – someone who has responsibility for the process itself, and its efficient running. This usually means a Change Manager. Having made your appointment, he or she will help to produce the basic steps in your CM process.

If you are sitting down to create a Change Management process for the first time, the best advice I can give you is to keep it simple. There seems to be something about flow charts that has an intoxicating effect on some people, to the extent that the basic Change Management process becomes a labyrinthine complex of twists, turns, loops and special cases. If this is what you’ve created, I’d advise you to revisit it.

I’d expect to see some or all of the following:

  • Initial analysis and filtering
  • Business case
  • Allocation of Priority and Initial Assessment
  • Creation of Impact Assessment, Back-out plans & Resource Estimates
  • Submission to Change Advisory Board (CAB) for authorisation, approval and comment. This is something I’m going to discuss in later posts.
  • Test Build or Implementation
  • Live Build or Implementation
  • Review
  • Update CMDB

Let’s take a look at some of these in more detail.

Creation of Impact Assessment, Back-out plans & Resource Estimates

This is a very valuable stage in any CM process. You examine the Change to see which of the services you offer to customers will be affected, or could be affected – in many cases you’ll be concerned with risk. The Back-Out plan is related to this – if things go horribly wrong, or at least don’t behave as expected, how can you roll back or undo the Change quickly? It’s worth considering this early on, because it usually affects the things that technicians have to do prior to the Change.

In some organisations I’ve seen these things produced during the CAB meeting itself. My personal preference is for as much as possible to be produced before the CAB meeting, for the simple reason that it makes the meetings easier to chair, quicker, and less subject to diversions. You’ll certainly need this information during the CAB meeting, however.

Business case

For Changes that are intended simply to deliver business improvement, you need a summary of the improvement you hope to make, along with any cost savings. If the Change is a response to one or more Incidents or Problems, you need information on the business effect (and ideally the costs) of these.

Test Build

Some organisations (though by no means a majority) maintain Test Infrastructure – usually grouped together in a Test Server Room. This is where a copy, or mirror image, of live infrastructure and systems is kept, for the sole purpose of testing Changes before promotion to live systems. Although this is off my main subject, both enterprise-level and desktop Changes can be handled in this way.

If you don’t have such a facility, it might still be possible (and is usually something worth doing) to perform testing on a platform specially created for the Change. In some organisations, depending on the nature of the Change, this takes the form of User Acceptance Testing.

Friday, 10 Aug 2007

The ingenuity of the home programmer is baffling, writes Duncan Davidson.

Last night, I met up with a couple of friends. After talking for a while, I pulled out a flame thrower and went looking for trouble. Within a couple of minutes I’d incinerated a bunch of guys sat in a jeep, and moved on somewhere else with my ‘crew’ in tow, fanned out in a line, army style.

In case you aren’t quite with me, I was playing an online computer game – and all of my victims will live to fight another day (or until tonight, anyway). The game in question (PC Halo, very popular with Serio staffers) is one that allows you to work with other players as a team, talk to them, and devise plans and strategies together. Although the media image is of the solitary gamer, modern games allow communication and interaction between real people in a way scarcely imaginable 10 years ago. My crew includes a leading veterinary researcher, a Serio colleague, a toolshop owner, a computer programmer (me) and a Danish construction worker.

As we moved off to engage the enemy across a barren landscape, we were all picked off by single shots to the head from a great distance – something that requires great skill to do.

Closer inspection revealed that our assailant was using what is called an aimbot – an aiming robot – that can be used with devastating results. Removing any elements of skill and spoiling the experience for the legitimate gamer, aimbots infect online gaming, defying efforts by companies such as Valve and others to stamp them out.

Unusually, the source code for the Halo aimbot is available online, although I’ll provide no link to it. Examining this code has taught me never to underestimate the skills of the teenage code bodger - the ingenuity shown is simply baffling.

The aimbot’s author had de-compiled the Halo game executable. I’ll explain what this means. Programmers like me use programming languages like C, C# and Delphi to write programs. These languages are human-readable, using words like ‘if…’, ‘for…’ and so on. These statements (which run into many thousands of pages) are then transformed into machine code, which the computer executes. This stuff runs into millions of pages of numbers, and is unreadable to humans… or so I thought.

The aimbot author had managed to work out where the game’s authors stored the position of each player. Think about that for a second. With millions of bytes, millions of numbers, he had somehow discovered the array in memory that contained all of the player details. What’s more, for each player, he had discovered what the player data meant – so he knew what direction you were facing, what your speed was… and he knew how to access this from another process under Windows – no mean feat.
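For the technically curious, the general technique – reading another process’s memory through the Windows API – looks something like the Python sketch below. Every specific value here (the process id, the address, the record size) is invented for illustration; finding the real values is precisely the reverse-engineering feat described above.

import ctypes

PROCESS_VM_READ = 0x0010
kernel32 = ctypes.windll.kernel32

pid = 1234                        # hypothetical: the game's process id
player_array_addr = 0x00A1B2C0    # hypothetical: address found by memory scanning
record_size = 64                  # hypothetical: size of one player record

handle = kernel32.OpenProcess(PROCESS_VM_READ, False, pid)
buf = ctypes.create_string_buffer(record_size)
bytes_read = ctypes.c_size_t(0)
kernel32.ReadProcessMemory(handle, ctypes.c_void_p(player_array_addr),
                           buf, record_size, ctypes.byref(bytes_read))
kernel32.CloseHandle(handle)
# buf.raw now holds one player's raw record; decoding position, facing
# and speed from those bytes is the hard part the author had solved.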

Computer games like Halo suffer from ‘lag’ caused by network latency – where you ‘see’ something is where it was a fraction of a second ago. To counter this, the aimbot has a system that looks at each player’s network performance and direction of travel and automatically adjusts the aim accordingly – with the impressive results I described at the start of this post.
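The adjustment itself is simple dead reckoning: aim at where the target will be, not where it appears to be. A minimal sketch, with all figures hypothetical:

def lead_target(pos, velocity, delay):
    # Predict the target's position 'delay' seconds from now,
    # assuming it keeps moving in a straight line.
    return tuple(p + v * delay for p, v in zip(pos, velocity))

pos = (120.0, 45.0, 3.0)     # last known position, metres
velocity = (4.5, 0.0, 0.0)   # estimated from recent movement, metres/second
delay = 0.08                 # estimated network lag, 80 ms

print(lead_target(pos, velocity, delay))   # (120.36, 45.0, 3.0)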

The author of this software was, apparently, spurred by a comment from the game’s authors, Bungie (now owned by Microsoft), that an ‘auto-aiming cheat was impossible’. As I know I’m going to get hammered again tonight by this device, I wish he’d kept his mouth shut.

Wednesday, 08 Aug 2007

This is a follow-up to Monday’s post about Change Management.

Scope

First of all, I’m going to discuss the typical scope of Change Management (CM) – the types of things that might be subject to a formal Change Management process. This might include Configuration Items such as:

  • Computer hardware, such as servers and desktops
  • Operating systems and other server software such as database servers
  • Routers, bridges, hubs and other types of network infrastructure
  • Documentation, both internal documentation (how to restart your CRM system for example) and end-user documentation (such as user guides and training materials)
  • Application software

Recall I mentioned in the previous post that Configuration Management and Change Management are linked? Each supports the other, but there is a relationship in content as well. If asked what types of thing should be put into the CMDB, I usually reply ‘Items that should be subject to Change Management’. It works the other way as well – ‘All of the Items in the CMDB should be subject to CM’.

There is one other useful concept of scope – production and development. Things that are in development are, of necessity, not subject to the same control as Items that are in production (and therefore have users). Consider these kinds of development Items as subject to project Change Management – in effect, subject to your project management disciplines before going into production.

Cost/Benefits Justifications

It can be difficult to produce an objective cost/benefit analysis for Change Management (no different, really, from many other ITSM disciplines). The reasons are obvious – it’s impossible to quantify the value of improvements you might identify in the future, or the costs of Incidents you’ll avoid. One thing you can be sure of is that you’ll pick up additional costs as you develop a CM process, create a Change Manager role, create a Change Advisory Board (CAB), and so on.

If you are asked to justify costs, here’s an approach you can take.

First of all, set some benchmarks for the costs of introducing Change Management – in much the same way you’d cost any other project. Estimate the number of hours that will be spent, assign typical hourly costs, add any capital expenditure, and produce a cost estimate for establishing Change Management over something like a 2-month period. I’d limit myself to the costs of getting to the point where you have a CM process, for the simple reason that those costs are more easily identifiable, and the ‘running’ costs are likely to be moderate.

Next, examine your Incident records for Incidents that have been caused by Changes executed incorrectly in some way – something I’m aware is rather easier to write about than to do. Ideally you’ll have been flagging these at the point of resolution for some time, so the ‘set’ of Incidents you’ll work with is easy to arrive at.

For these Incidents, estimate a very rough cost. You can do this by using downtime as a guide, and looking at one of my previous posts about the costs of unavailability.

What you can then do is compare the costs of the problem with the costs of the cure, as in the sketch below. I’ve seen someone try to quantify (in monetary terms) the value of future improvements, but I’ve never supported this approach – you may as well use a random number generator, as it’s not objective in any way.
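Here’s a minimal sketch of that comparison in Python, with entirely hypothetical figures – substitute your own hours, rates and Incident data:

# Cost of the cure: establishing Change Management (hypothetical figures).
setup_hours = 200            # process design, CAB creation, training
hourly_rate = 50.0           # blended hourly cost, in pounds
capital_costs = 2000.0       # tooling, documentation and so on
cure = setup_hours * hourly_rate + capital_costs            # 12,000

# Cost of the problem: Incidents flagged as caused by Changes,
# costed very roughly from downtime (see the earlier post on
# the costs of unavailability).
incidents_per_year = 40
avg_downtime_hours = 3.0
cost_per_downtime_hour = 400.0
problem = incidents_per_year * avg_downtime_hours * cost_per_downtime_hour  # 48,000

print(f'Cure: £{cure:,.0f} / Problem: £{problem:,.0f} per year')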

I’ll continue these posts later - either this week or next.

Monday, 06 Aug 2007

This is going to be the first in a series of posts about Change Management. I’m going to be writing for those that currently have no Change Management process, and for those that have one but feel they could do things better. I’ll also be branching off from time to time to look at different aspects of the Serio tool.

Changes come about as ‘outputs’ from Incident Management or Problem Management – or quite often simply because we want to make improvements in our IT infrastructure to better serve the business.

Purpose of Change Management

The purpose of Change Management is:

  • to minimise the business impact of Incidents that might occur through Change implementation
  • to have a standard, effective, repeatable way of assessing and managing the implementation of Changes to IT services and infrastructure
  • to seek business benefit and improvement

Your Change process will (most likely) have some or all of the following:

  • an assessment of risk
  • an impact assessment
  • communication with business stakeholders and/or user communities
  • assessment of resource requirements
  • cost and benefit analysis
  • business opportunity assessment

Change Management and Configuration Management

I’ve blogged a lot about Configuration Management and the Configuration Management Database (CMDB) – search the blog for CMDB and you’ll find the posts. The two processes are linked. It’s something that I’ll probably come back to in more detail in later posts, but if you consider for a moment something like an impact assessment, you need to ask yourself how you can do that without the level of detail and documentation that a CMDB provides. For instance, without a CMDB you may simply have to ask someone (usually a tech guru) about impact – but where does your tech guru get his information?
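To make that concrete, here’s a toy sketch of the kind of question a CMDB answers during impact assessment – every CI name and relationship below is invented for illustration:

# Toy CMDB: each entry lists the CIs an Item directly depends on.
depends_on = {
    'crm-service':   ['app-server-1'],
    'email-service': ['mail-server-1'],
    'app-server-1':  ['db-server-1', 'switch-3'],
    'mail-server-1': ['switch-3'],
    'db-server-1':   ['san-1'],
}

def impacted(ci, graph=depends_on):
    # Which Items depend, directly or indirectly, on the CI being Changed?
    hit = set()
    changed = True
    while changed:
        changed = False
        for item, deps in graph.items():
            if item not in hit and (ci in deps or hit & set(deps)):
                hit.add(item)
                changed = True
    return sorted(hit)

print(impacted('switch-3'))
# ['app-server-1', 'crm-service', 'email-service', 'mail-server-1']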

So, if you are thinking about implementing a Change Management process, you might also find you need to consider a Configuration Management process as well.

Prime Movers for a Change Management Process

My experience with customers has tended to show that the main driver for more formal procedures is a feeling that Changes cause too many Incidents – in other words, Changes are not planned carefully enough, tested enough, and are implemented too quickly.

It is a shame that this is a key driver, because it makes it easy to overlook Change as a vehicle for delivering business benefits – something I usually advise customers to ‘build in’ to their process by having specific (usually small) focus groups whose remit includes being outward-looking and acting as agents of innovation.

My next post will look at what should be covered by Change Management, and cost benefits/justification.

Friday, 03 Aug 2007

Are you comfortable with the idea that you can play the music you’ve paid for on iTunes only on your iPod? Surely it’s time to put consumers first, argues Mark James.

Digital rights management (DRM) is a big issue right now. Content creators have a natural desire to protect their intellectual property, while consumers want easy access to music, video, and other online content.

The most popular portable media player is the Apple iPod, by far the most successful digital music device to date. Although an iPod can play ordinary MP3 files, its success is closely linked to iTunes’ ease of use. iTunes is a closed system built around an online store with (mostly) DRM-protected tracks using a system called FairPlay that is only compatible with the iTunes player or with an iPod.

Another option is to use a device that carries the PlaysForSure logo. These devices use a different DRM scheme - Windows Media - this time backed by Microsoft and its partners. Somewhat bizarrely, Microsoft has also launched its own Zune player using another version of Windows Media DRM - one that's incompatible with PlaysForSure.

There is a third way to access digital media - users can download or otherwise obtain DRM-free tracks and play them on any player that supports their chosen file format. To many, that sounds chaotic. Letting people download content without the protection of DRM! Surely piracy will rule and the copyright holders will lose revenue.

But will they? Home taping has been commonplace for years, but there was always a quality issue. Once the development of digital music technologies allowed perfect copies to be made at home, the record companies hid behind non-standard copy prevention schemes (culminating in the Sony rootkit fiasco) and DRM-protected online music. Now video content creators are following suit, with the BBC and Channel 4 both releasing DRM-protected content that will only play on some Windows PCs. At least the BBC does eventually plan to release a system that is compatible with Windows Vista and Macintosh computers, but for now the iPlayer and 4 on Demand are for Windows XP users only.

It needn’t be this way: incompatible DRM schemes restrict consumer choice and are totally unnecessary. Independent artists have already proved the model can work by releasing tracks without DRM. And after the Apple CEO, Steve Jobs, published his Thoughts on Music article in February 2007, EMI made its catalogue available, DRM-free, via iTunes, for a 25% premium.

I suspect that the rest of the major record companies are waiting to see what happens to EMI’s sales and whether there is a rise in piracy of EMI tracks, which in my opinion is unlikely. The record companies want to see a return to the 1990s boom in CD sales, but that was an artificial phenomenon as music lovers re-purchased their favourite analogue (LP) records in a digital (Compact Disc) format. The way to increase music sales now is to remove the barriers to online content purchase.

  • The first of these is cost. Most people seem happy to pay under a pound for a track, but expect album prices to be lower (matching the CDs that can be bought in supermarkets and elsewhere for around £9). Interestingly though, there is anecdotal evidence that if the price of a download were reduced to around $0.25 (instead of the current $0.99), people would actually download more songs and the record companies would make more money.
  • Another barrier to sales is ease of use and portability. If I buy a CD (still the benchmark for music sales today), then I only buy it once - regardless of the brand of player that I use. Similarly, if I buy digital music or video from one store - why should I have to buy it again if I change to another system?

One of the reasons that iTunes is so popular is that it’s very easy to use - the purchase process is streamlined and the synchronisation is seamless. But it also locks consumers into one platform and restricts choice. Microsoft’s DRM schemes do the same. And obtaining pirated content on the Internet requires a level of technical knowledge that many do not possess.

If an open standard for DRM could be created, compatible with both FairPlay and Windows Media (PlaysForSure and Zune), it would allow content owners to retain control over their intellectual property without restricting consumer choice. 

Tuesday, 31 Jul 2007

There’s an interesting post over at the Item Community forum at the moment. Contributor ITILNeutral explains how ITIL is being introduced at his company:

…A sizeable number of staff will lose their jobs (there is very little assimilation, everyone of us has to be re-interviewed for our jobs which we are totally not qualified for as far as ITIL is concerned)… ...As an example the few colleagues whose jobs are deemed already to be ITIL compatible have been assimulated but they have taken up to a £5,000 a year pay cut each...

Pretty surprising stuff. It’s unusual for management to take such drastic steps, if only for the sake of simple expediency. Management actions such as this cause uncertainty, and uncertainty is a catalyst for any organisation’s brightest and best to leave for other employment – something that can have quite devastating effects, in the short and medium term, on services delivered to customers.

Although I’ve not experienced such a ‘wrecking’ approach from management, I have seen something similar in my career, when a former employer (and a job I enjoyed very much) ran into financial difficulties. Within 3 months the 5 most experienced and able staff had left, rather than waiting to be ‘downsized’. My recollection is that the business suffered further as a result.

On the whole, based on what ITILNeutral has written, I’d regard the actions of the managers there as plain barmy. However, it would be interesting to see what the state of IT services in the company was like, and whether this response was some kind of backlash from a frustrated and angry business/user community. Over the years I’ve seen some pretty poor IT helpdesk/service desk operations, where staff can’t be bothered to pick up the phone, cherry-pick Incidents so that those that are ‘difficult’ or involve awkward customers are never attended to, and return very poor service for the investment their companies have made.

In companies like these, achieving any kind of organisational change is difficult – no amount of ITIL training, role play or simulation will help. My response to this in the past has been to look carefully at team leaders – to recruit or appoint the right people and use them as the catalysts for change, so that rather than making cultural and service delivery changes to a team of 30, we are working with teams of 5 or 6. In effect, breaking the problem down into more manageable pieces and tackling resistance or apathy at the level of the team.

Generally I’ve not found it necessary to be anywhere near as confrontational.

My first ever boss told me that being a manager in IT was 'like training cats'. With that in mind, and because there seems to be a degree of irrationality in the new manager in this organisation, everyone concerned has my sympathy.

Friday, 27 Jul 2007

'Your Helpdesk/Service Desk may be closer to serving a consumer culture than you think' writes Tracey Caldwell.

IT departments are losing control over the IT used in their companies. The days of bestowing a technology solution on grateful masses seem increasingly distant. The users are revolting, bringing to the workplace the consumer technologies they find useful outside it. Technology news feeds and business technology blogs are just as interested in social networking and mobile telephony as they are in gigabytes of this or new versions of that.

Market research giant Gartner Group has come up with a word for this trend - consumerisation. Gartner reckons consumerisation is a catalyst for the growing conflict between the traditional enterprise IT function, which has been in sole charge of enterprise IT architecture, and the growing desire and ability of employees to influence their use of IT. IT staff may have other words for it, believing consumerisation spells disaster for compliance, security and support, and perhaps the entire IT infrastructure of their business.

Gartner has even put out a special report about it, warning businesses to change their attitude toward consumer-led technology appearing in the enterprise from ‘unavoidable nuisance’ to ‘opportunity for additional innovation’. A bit of a surprise, then, that Gartner was reported joining a host of other commentators warning businesses off the iPhone at its launch, worrying about security and voice quality issues.

True, quite a few technologies that started out as consumer technologies have made an impact in corporate IT, from the PC to today’s invasion of the enterprise by consumer-led instant messaging and desktop search.

As web-based companies put out beta technology, let consumers make what they will of it, and work out how to make money from it afterwards, savvy business chiefs can’t wait for the technology to mature, as they might once have done. But what consumer technology is hot and what is not? Gartner thinks it has the answers.

Apparently the next round of consumer-led innovations likely to have a real effect on revenue or internal spending and processes within three years includes: web-based application services spreading into business use; private communications channels such as email and IM being overtaken by community communication, where privacy is not taken for granted; desktop videoconferencing; and portable virtual machines.

Users are already showing worrying (for the Helpdesk or Service Desk) interest in running virtual environments on their PCs, not least prompted by the incompatibilities of new systems. Some enterprises are already looking to reflect this by implementing a virtual desktop environment as their server-based system of choice. This brings a whole host of security concerns, but it looks like the bullet will have to be bitten and those concerns addressed, because Gartner forecasts great things for virtualisation.

Further into the future, it thinks virtual technologies will be extended to produce augmented realities, where a PC or mobile device will provide an interface and information relevant to the context and location of the user. Unboggle your minds and think of applications in plant maintenance, training, computer-aided surgery or PCB diagnostics, for example.

Wednesday, 25 Jul 2007

This is a follow-up post to An Introduction to MPLS. That post tried to give some background on MPLS and described the use of edge routers and the MPLS ‘cloud’.

This post is going to talk about monitoring the service on Cisco routers. What follows works on 2600-series routers, but will also probably work on any later model Cisco router.

As with any monitoring exercise, you need to decide upfront what it is you are interested in. If it’s just ‘circuit availability’ (is the link up or down?) then it’s a simple case of configuring the routers to send LINK-DOWN traps to the Command Center. Usually though, customers are interested in more subtle things like ‘how well is the link performing?’ as opposed to just ‘is it working?’.

Fortunately there are some pretty useful things already in the Cisco operating system (IOS) to help us.

The way we’ve approached this is to set up probes on each of the routers – probes are part of the Cisco operating system. Here is what a probe does: if you consider a pair of Edge Routers either side of the MPLS service, the probe causes test data to be sent from one router through MPLS, detected by the other in the pair, and then echoed back again. In doing so, you can measure:

  • Latency (how long it takes for the round trip)
  • Jitter (how much the round trip time varies)
  • Packet loss (did we lose any data on the round trip?)

Jitter is a statistic you should only look at if you are trying to use Voice over IP (VoIP), or are sending voice-class data over your MPLS link. Latency and Packet Loss, however, are relevant statistics even if you are just sending data.

The Cisco routers gather all of this data for you, and place it in an SNMP table you can read from the Command Center (you’ll find a MIB and Command Center Script at the end of this post). With a few simple calculations that the script performs, you can get Latency, Jitter and Packet Loss from the table.

These commands can be used to set up the Edge Routers:

rtr 1
type jitter dest-ipaddr 111.111.111.1 dest-port 2048 num-packets 1000
request-data-size 86
frequency 30
rtr schedule 1 life 2147483647 start-time now
 

where 111.111.111.1 is the address of the other router in the pair.

The default Cisco SNMP packet size is too small to allow the statistics table to be read, so the following command is required:

snmp-server packetsize 8192
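With the probes running and the packet size increased, you can read the statistics table from any SNMP tool, not just the Command Center. Here’s a minimal sketch using Python’s pysnmp library – the object names are the jitter ‘latest operation’ columns from CISCO-RTTMON-MIB as I understand them, and the trailing 1 is the probe number from ‘rtr 1’ above, so treat both as assumptions to verify against the MIB linked at the end of this post:

from pysnmp.hlapi import (SnmpEngine, CommunityData, UdpTransportTarget,
                          ContextData, ObjectType, ObjectIdentity, getCmd)

def fetch(router, obj):
    # Read one value for probe index 1 from the router's RTTMON table.
    err_ind, err_stat, _, var_binds = next(getCmd(
        SnmpEngine(), CommunityData('public'),
        UdpTransportTarget((router, 161)), ContextData(),
        ObjectType(ObjectIdentity('CISCO-RTTMON-MIB', obj, 1))))
    if err_ind or err_stat:
        raise RuntimeError(str(err_ind or err_stat))
    return int(var_binds[0][1])

rtt_sum = fetch('10.0.0.1', 'rttMonLatestJitterOperRTTSum')
rtt_num = fetch('10.0.0.1', 'rttMonLatestJitterOperNumOfRTT')
loss = (fetch('10.0.0.1', 'rttMonLatestJitterOperPacketLossSD') +
        fetch('10.0.0.1', 'rttMonLatestJitterOperPacketLossDS'))

print('average latency (ms):', rtt_sum / max(rtt_num, 1))
print('packets lost:', loss)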

Calculations

The probe listed above will send a stream of approximately 20 kbit/s, as shown below:

  • Send 86-byte packets (74 bytes of RTP payload + a 12-byte RTP header), plus 28 bytes of IP and UDP headers.
  • Send 1000 packets for each frequency cycle.
  • Send every packet 20 milliseconds apart, for a duration of 20 seconds, and sleep 10 seconds before starting the next frequency cycle (a 30-second cycle in total).

((1000 × 74 bytes) / 30 seconds) × 8 bits per byte ≈ 19,733 bit/s ≈ 19.7 kbit/s
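As a quick sanity check, the same arithmetic in Python (assuming the 20-second send / 10-second sleep cycle described above):

packets = 1000
payload_bytes = 74       # RTP payload per packet
cycle_seconds = 30       # 20 s of sending plus 10 s of sleep

kbits_per_second = packets * payload_bytes * 8 / cycle_seconds / 1000
print(f'{kbits_per_second:.1f} kbit/s')   # 19.7 kbit/s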

These links on the Cisco website offer more detail:

http://www.cisco.com/en/US/products/sw/cscowork/ps2144/products_user_guide_chapter09186a00800f4ec8.html
http://www.cisco.com/en/US/tech/tk869/tk769/technologies_white_paper09186a00801b1a1e.shtml

and the MIB for the Cisco table is here:

rttMONMIB

and the script is here (right-click each link and 'save as...')

Monday, 23 Jul 2007

This is a post about MPLS (definition below) and a monitoring project we’ve recently helped a Command Center customer with. I’ll start by talking about MPLS in general, and post the monitoring stuff later. I'm going to assume you've never heard of MPLS.

MPLS for Dummies

It stands for Multiprotocol Label Switching. It’s a way of speeding up network traffic by avoiding the time it takes for a router to look up the address of the next node to send a packet to. In an MPLS network, data has a label attached to it, and the path taken is based on the label. Additionally, you can attach a class to the data, which can be used to indicate that the data has higher than usual priority.
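To make ‘the path is based on the label’ concrete, here’s a toy Python sketch of the label-forwarding table inside a label-switching router – all labels and hop names are invented for illustration:

# Toy label forwarding table: incoming label -> (outgoing label, next hop).
# An MPLS router does one exact-match lookup like this per packet,
# instead of a longest-prefix IP route lookup.
lfib = {
    17: (24, 'edge-router-london'),
    24: (31, 'core-router-2'),
}

def switch(label, payload):
    out_label, next_hop = lfib[label]
    print(f'relabel {label} -> {out_label}, forward to {next_hop}')
    return out_label, payload

switch(17, b'packet data')   # relabel 17 -> 24, forward to edge-router-london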

Whilst the above takes care of the ‘Label Switching’ part of the name, the ‘Multiprotocol’ part comes from the fact that Asynchronous Transfer Mode (ATM), Internet Protocol (IP) and Frame Relay network data can all be sent using MPLS.

For most companies, MPLS will be a service that they buy from a network services provider, and it might be beneficial to think of it thus: a pair of routers (the idea of a pair is important) on either side of an ‘MPLS cloud’. Example: you have two offices you want to link – say London and Edinburgh. You have in each office a router which interfaces with the MPLS service. When a device in Edinburgh wants to send data to a device in London it is sent via the Edinburgh router onto the MPLS service (appropriately labelled and classified by the router) where it will appear (eventually) on the London router for passing to the correct device. Between the two routers (referred to as ‘edge routers’ because they sit on the edge of the MPLS service) the data is the responsibility of the network services provider. For this reason, an MPLS service is often referred to as a cloud in network diagrams (‘we don’t know or care what happens here’).

So why bother? Among the handful of customers we know using these services, MPLS is replacing leased lines. One of the key drivers seems to be cost – the MPLS services are working out cheaper than leased lines. Another driver, however, seems to be the desire to offer new services to users, one of which is Voice over IP (usually shortened to VoIP).

MPLS can be a sound (heh) choice for VoIP because of the idea of prioritising and classifying data. For VoIP to work, packets need to be sent quickly and at a relatively stable speed – otherwise you get distortions on the line. MPLS therefore offers the promise of ‘first-class mail’ for voice packets and ‘second-class mail’ for data over the same network path (data being less sensitive to both the speed of transmission and its variance).
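As a toy illustration of the classification idea (not how a real router implements its queuing), here’s a strict-priority scheduler in Python:

from collections import deque

# 'First class' (voice) is always served before 'second class' (data).
queues = {'voice': deque(), 'data': deque()}

def enqueue(klass, packet):
    queues[klass].append(packet)

def dequeue():
    for klass in ('voice', 'data'):   # voice strictly first
        if queues[klass]:
            return klass, queues[klass].popleft()
    return None

enqueue('data', 'file-chunk-1')
enqueue('voice', 'rtp-frame-1')
print(dequeue())   # ('voice', 'rtp-frame-1'), despite arriving second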

MPLS links:
MPLS Seminar notes – A pretty good introduction
Wiki – not for the faint hearted

I’ll post the monitoring details later in the week.

Friday, 20 Jul 2007

Jim emails: ‘Put this one in the blog if you like, but edit out the numbers and change my name. I have to keep track of 000’s of computers that are on customer sites (we don’t own the equipment but need to know where it is). The trouble is our engineers move things and change things without telling me. We have a procedure but it’s routinely not followed. What can I do, or should I start looking for another job?’

Hi ‘Jim’ and thanks for emailing. It’s probably a bit early to start sending your CV out, as there are a few things you can do and suggest to make improvements.

I know from our follow-up emails that you have a Change procedure at your company – the trouble is that it isn’t being followed. A crucial part of your Change process (and indeed most such processes) is the step where the Asset Management/Configuration Management process is informed, allowing records to be brought up-to-date and re-verified.

This non-conformance leads to all sorts of problems. For example: someone phones up to report a problem with equipment which you have registered against site X, but the equipment has been moved to site Y (and subsequently cannot be found). I also know that there is a lot of movement going on each week.

The temptation might be to storm into the Service Delivery Manager’s office and demand that everyone is fired. However, it’s always best to be constructive when trying to solve problems, so I’d try to do a bit of fact-finding first.

I’d want to find out why the Change procedures, which should notify you of all movements and Changes, are not being followed. Specifically, I’d want to establish:

  • If no-one actually knows about these procedures
  • If the Change procedures are generally known about but are not followed because of some negative perception (it might be they are perceived as cumbersome and bureaucratic)
  • If there is a general culture of apathy and non-cooperation
  • If other cultural factors are at play (such as significant time pressure placed on field engineers, who respond by moving from one job straight to the next)

What you discover on the fact-find will enable you to suggest solutions.

  • Procedure not understood or known > More training, better documentation
  • Procedures known but not followed > Suggests a re-drafting of your procedures, and/or greater management support and enforcement

… and so on. Be careful how you do your fact-find. In particular, be wary of asking only managers, who often have a completely different view of the organisation from technicians and engineers.

It might also be the case that you need more senior managerial support than you are getting – this is a really vital ingredient. Assuming that any issues with the working practices themselves have been resolved, you sometimes need senior managers to remind people that compliance is mandatory, not optional.
