Serio Blog

Wednesday, 25 Jul 2007

This is a follow-up post to An Introduction to MPLS. That post tried to give some background on MPLS and described the use of edge routers and the MPLS ‘cloud’.

This post is going to talk about monitoring the service on Cisco routers. What follows works on 2600-series routers, but will also probably work on any later model Cisco router.

As with any monitoring exercise, you need to decide upfront what it is you are interested in. If it’s just ‘circuit availability’ (is the link up or down?) then that’s simply a case of configuring the routers to send LINK-DOWN traps to the Command Center. Usually though, customers are interested in more subtle things like ‘how well is the link performing?’ as opposed to just ‘is it working?’.

Fortunately, there are some useful features already built into the Cisco operating system to help us.

The way we’ve approached this is to set up probes on each of the routers (probes are a feature of the Cisco operating system). Here is what a probe does: consider a pair of Edge Routers, one on either side of the MPLS service. The probe causes test data to be sent from one router through MPLS, detected by the other router in the pair, and then echoed back again. From that round trip, you can measure

  • Latency (how long it takes for the round trip)
  • Jitter (how much the round trip time varies)
  • Packet loss (did we lose any data on the round trip)

Jitter is a statistic you should only look at if you are trying to use Voice over IP (VoIP), or are sending voice-class data over your MPLS link. Latency and packet loss, however, are relevant statistics even if you are just sending data.

The Cisco routers gather all of this data for you, and place it in an SNMP table you can read from the Command Center (you’ll find a MIB and Command Center Script at the end of this post). With a few simple calculations that the script performs, you can get Latency, Jitter and Packet Loss from the table.
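
As an illustration of the sort of arithmetic involved (this is not the actual Command Center script, and the variable names below are hypothetical stand-ins for the counters held in the router’s statistics table), the calculations look roughly like this:

# Hypothetical counter values, as they might be read for one probe cycle.
# The real object names are in the rttMon MIB linked at the end of this post.
rtt_sum_ms     = 42000   # sum of all round-trip times measured in the cycle (ms)
rtt_count      = 985     # number of round trips completed
packets_sent   = 1000    # num-packets configured on the probe
packets_lost   = 15      # lost source-to-destination plus destination-to-source
jitter_sum_ms  = 1968    # sum of inter-packet delay variation samples (ms)
jitter_samples = 984     # number of jitter samples taken

avg_latency_ms = rtt_sum_ms / float(rtt_count)          # average round trip
avg_jitter_ms  = jitter_sum_ms / float(jitter_samples)  # average jitter
loss_percent   = 100.0 * packets_lost / packets_sent

print("Latency %.1f ms, jitter %.1f ms, packet loss %.1f%%"
      % (avg_latency_ms, avg_jitter_ms, loss_percent))

(The real table breaks several of these counters down by direction, source-to-destination and destination-to-source, so a little summing is needed before dividing.)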

These commands can be used to set up the Edge Routers:

rtr 1
type jitter dest-ipaddr 111.111.111.1 dest-port 2048 num-packets 1000
request-data-size 86
frequency 30
rtr schedule 1 life 2147483647 start-time now
 

where 111.111.111.1 is the address of the other router in the pair.

The default Cisco SNMP packet size is too small to allow the statistics table to be read, so the following command is required:

snmp-server packetsize 8192
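
Once the packet size has been increased, the table can be read back with any standard SNMP tool. Here is a rough sketch in Python, shelling out to net-snmp’s snmpwalk; the community string is a placeholder, and the table name is my assumption about the most useful object to start with (check it against the rttMon MIB linked below):

import subprocess

router    = "111.111.111.1"   # the Edge Router being polled
community = "public"          # replace with your read-only community string

# Assumed object name; verify it against the rttMon MIB before relying on it,
# and make sure the MIB file is loaded so snmpwalk can translate the name.
table = "CISCO-RTTMON-MIB::rttMonLatestJitterOperTable"

result = subprocess.run(
    ["snmpwalk", "-v2c", "-c", community, router, table],
    capture_output=True, text=True, check=True,
)
print(result.stdout)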

Calculations

The probe configured above will send a stream of approximately 20 kbps, as shown below:

  • Send 86-byte packets (74 bytes of payload plus a 12-byte RTP header), plus 28 bytes of IP and UDP headers.
  • Send 1000 packets in each frequency cycle.
  • Send one packet every 20 milliseconds (the default inter-packet interval), so the 1000 packets take roughly 20 seconds; the probe then sleeps for the remaining 10 seconds or so before the next 30-second frequency cycle starts.

((1000 * 74) / 30 seconds) * 8 bits per byte ≈ 19.7 kbps
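
If you want to sanity-check that figure, here is the same arithmetic in Python, using only the numbers from the probe configuration above:

packets_per_cycle = 1000   # num-packets
payload_bytes     = 74     # the 86-byte request minus the 12-byte RTP header
cycle_seconds     = 30     # frequency

bits_per_second = (packets_per_cycle * payload_bytes / float(cycle_seconds)) * 8
print("%.1f kbps" % (bits_per_second / 1000.0))   # prints 19.7 kbps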

These links on the Cisco website offer more detail:

http://www.cisco.com/en/US/products/sw/cscowork/ps2144/products_user_guide_chapter09186a00800f4ec8.html
http://www.cisco.com/en/US/tech/tk869/tk769/technologies_white_paper09186a00801b1a1e.shtml

and the MIB for the Cisco table is here:

rttMONMIB

and the script is here (right-click each link and 'save as...')

Monday, 23 Jul 2007

This is a post about MPLS (definition below) and a monitoring project we’ve recently helped a Command Center customer with. I’ll start by talking about MPLS in general, and post the monitoring stuff later. I'm going to assume you've never heard of MPLS.

MPLS for Dummies

It stands for Multiprotocol Label Switching. It’s a way of speeding up network traffic by avoiding the time it takes for a router to look up the address of the next node to send a packet to. In an MPLS network, data has a label attached to it, and the path that is taken is based on the label. Additionally, you can attach a class to the data, which can be used to indicate that the data has higher than usual priority.

Whilst the above takes care of the ‘Label Switching’ part of the name, the ‘Multiprotocol’ part comes from the fact that Asynchronous Transfer Mode (ATM), Internet Protocol (IP) and frame relay network data can all be sent using MPLS.

For most companies, MPLS will be a service that they buy from a network services provider, and it might be beneficial to think of it thus: a pair of routers (the idea of a pair is important) on either side of an ‘MPLS cloud’. Example: you have two offices you want to link – say London and Edinburgh. You have in each office a router which interfaces with the MPLS service. When a device in Edinburgh wants to send data to a device in London it is sent via the Edinburgh router onto the MPLS service (appropriately labelled and classified by the router) where it will appear (eventually) on the London router for passing to the correct device. Between the two routers (referred to as ‘edge routers’ because they sit on the edge of the MPLS service) the data is the responsibility of the network services provider. For this reason, an MPLS service is often referred to as a cloud in network diagrams (‘we don’t know or care what happens here’).

So why bother? From the handful of customers we know using these services, the MPLS service is replacing leased lines. One of the key drivers seems to be cost – the MPLS services are working out cheaper than a leased line. However, another driver seems to be the desire to offer new services to users, one of which is Voice over IP (usually shortened to VoIP).

MPLS can be a sound (heh) choice for VoIP because of the idea of prioritising and classifying data. For VoIP to work, packets need to be sent quickly and at a relatively stable speed – otherwise, you get distortions on the line. MPLS therefore offers the promise of ‘first class mail’ treatment for some data packets (voice) and ‘second class mail’ for others (data) over the same network path (data is less sensitive to the speed of transmission, and to variance in that speed).

MPLS links:
MPLS Seminar notes – A pretty good introduction
Wiki – not for the faint hearted

I’ll post the monitoring details later in the week.

Friday, 20 Jul 2007

Jim emails: ‘Put this one in the blog if you like, but edit out the numbers and change my name. I have to keep track of 000s of computers that are on customer sites (we don’t own the equipment but need to know where it is). The trouble is our engineers move things and change things without telling me. We have a procedure but it’s routinely not followed. What can I do, or should I start looking for another job?’

Hi ‘Jim’ and thanks for emailing. It’s probably a bit early to start sending your CV out, as there are a few things you can do and suggest to make improvements.

I know from our follow-up emails that you have a Change procedure at your company – the trouble is that it isn’t being followed. A crucial part of your Change process (and indeed most such processes) is the step where the Asset Management/Configuration Management process is informed, allowing records to be brought up-to-date and re-verified.

This non-conformance leads to all sorts of problems, such as: someone phones up to report a problem with equipment which you have registered against site X, but the equipment has been moved to site Y (and subsequently cannot be found). I also know that there is a lot of movement going on each week.

The temptation might be to storm into the Service Delivery Manager’s office and demand that everyone is fired. However, it’s always best to be constructive when trying to solve problems, so I’d try to do a bit of fact-finding first.

I’d want to find out why the Change procedures, which should notify you of all movements and Changes, are not being followed. Specifically, I’d want to establish:

  • Whether no-one actually knows about these procedures
  • Whether the Change procedures are generally known about but are not followed because of some negative perception (it might be that they are perceived as cumbersome and bureaucratic)
  • Whether there is a general culture of apathy and non-cooperation
  • Whether other cultural factors are at play (such as significant time pressure placed on field engineers, who respond by moving from one job straight to the next)

What you discover on the fact-find will enable you to suggest solutions.

  • Procedure not understood or known > More training, better documentation
  • Procedures known but not followed > Suggests a re-drafting of your procedures, and/or greater management support and enforcement

… and so on. Be careful how you do your fact-find. Specifically, be careful of asking just managers, who often have a completely different view of the organisation from technicians and engineers.

It also might be the case you need more senior managerial support than you are getting – this is a really vital ingredient. Assuming that any issues with your working practices themselves have been resolved, you sometimes need senior managers to remind people that compliance is mandatory, and not optional.

Wednesday, 18 Jul 2007

This is the final post in the series about Knowledge Base content and design, and follows on from my earlier post on the subject. The earlier post expanded on the quality system; this post gives a worked example for WidgetCo. Hopefully this will help bring all of the previous posts together!

WidgetCo Knowledge Base Quality System

Editor’s Brief

Editor: George Ritchie
Catalog Name: Desktop KB
Catalog Accepted Formats: HTML, PDF
Audience: Service Desk Staff, 2nd Line Support Technicians
Catalog Description: Contains both Incident Resolution and How-To documents for Desktop-based computers running XP and Vista.
Examples of subjects that can be covered:
Resolving common display driver problems
Troubleshooting VPN connection problems
How to roll out a laptop from one of our ghosted images
Resetting a user password
 

Routine tasks for the Editor

Weekly: Check for new Document suggestions. These will either be suggestions emailed to you, or Incidents resolved in the last week and flagged with ‘KB suggestion’. From these, produce a list of candidate new documents. Describe each document with either an Incident reference number, or a paragraph describing the subject.

Weekly: Check for document feedback through our feedback mechanisms. Make sure each respondent receives an acknowledgement where contact details are provided.

Monthly: Review the topics (search terms) that have been submitted to the Knowledge Base Engine. Look for topics that are not covered (or adequately covered) and use this to produce a list of candidate new documents.

Monthly: Send an email to all potential users giving a title and link to each new document created.

Yearly: Each document should have a ‘Review Tag’ at the bottom – for example, REVIEW2007. This is the point at which the document is to be reviewed. As reviews are performed just once a year, this kind of tag will work fine. Simply use the Knowledge Base search facility to locate documents tagged REVIEW2007, review the content, and then update the tag to say REVIEW2008 – and so on. Reviews should check documents for accuracy and relevance.

Procedures for adding a new Document

Prior to adding the document to the Catalog:
Check that the document is not already included in the Catalog, or in any other possible Catalog.
Have the document peer-reviewed by a member of the 3rd line support team if required.
Ensure that the document is accurate and on topic.
 

Reporting and KPIs

A monthly report should be submitted to the Service Delivery Manager detailing:
The number of documents created that month
Number of queries performed in total by consumers that month
Summary of user feedback
Any other issues affecting search relevance
 

Monday, 16 Jul 2007

This is a follow-up to my last post Designing a KB Quality System. In this post, I’m going to give a more detailed description of what is included.

Recall that in the last post I said we wanted:

  • Accuracy
  • Relevance (conformance with the Editor’s Brief)
  • Non duplication of content
  • Feedback mechanisms
  • Periodic review
  • A way for new documents to be suggested

The first three of those mean that it’s unlikely you’ll have a system where anyone can create a document and just add it (what I call a ‘dumpster’ Knowledge Base). Of course, you could say that contributors have to check these things before adding, but this is likely to be done with varying degrees of success – and the bigger the team, the more problematic it usually becomes. (As an aside, I’m not generally a big fan of anything being owned by ‘the team’.)

Instead, you will have a small number (ideally 1) of Editors who will check accuracy, relevance and uniqueness before adding. There are two ways you can use Editors:

  • The Editor writes all documents based on suggestions from colleagues or content consumers
  • Or, the Editor checks documents written by others before inclusion.

My experience is that although the second option offers the prospect of more and better content, in practice you’ll still need someone to lead the process, and you’ll probably end up doing the first.

You need some form of feedback mechanism. For Serio users, this will usually mean either enabling the ‘rate this document’ functionality, which allows SerioWeb users to comment on documents for you, or simply having a special ‘feedback’ email address that goes directly to the Editor. In my experience, the more technical your audience, the more likely they are to report technical errors.

Periodic review does just what it says. You need to review documents every once in a while because technology changes. For example, an incompatibility between two products might be resolved, prompting either removal or updating of the document.

Finally, you need a way for content users to suggest new documents. Now, there are two ways to do this.

  • Look at what consumers are searching for.
  • Or, ask for suggestions directly.

Looking at what search terms are being submitted, and what results are returned, is an essential activity. If you are a Serio user, please note that logging of this data is OFF by default – switch it on and you’ll be able to see all the search terms being targeted by your consumers (see ‘Monitoring search terms used’ in the HowTo guide). This will help you identify weak areas and suggest areas for improvement.

If, as an editor, you just say ‘suggest some articles!’ you probably won’t get much of a response. Instead, create a simple and structured environment. For Helpdesks and Service Desks, this usually means a way to ‘flag’ Incidents at the point of closure so as to say ‘a knowledge base article for this is required’.

For Serio users, this usually means using Agent Status B (set to something like ‘KB Suggestion’) and a question like ‘Should this Incident be suggested for Knowledge Base inclusion Yes/No?’ as part of the resolution Action. All the Editor needs to do is scoop these up once in a while, and decide based on the Editor’s Brief whether each should be included or not.

I’ll take all this in my next post and construct a worked example.

Thursday, 12 Jul 2007

I’ve been blogging recently about Knowledge Base content. Before proceeding, I’m going to do a quick summary of what I’ve said so far.

Content is the important thing – if you don’t make a real effort to get good, useful and relevant content you are wasting your time.

Think carefully in advance about your content. Group related content into a small number of Catalogs, and then document each Catalog. Describe the target audience, and the type of documents you’ll be creating. Create a small number of example documents that will show the style and layout to be used. All of this documentation will become the Editor’s Brief.

When designing your example documents, take a little time to help your Indexing/Search system. Find out how you can help it really understand what the document is about, and then use this in your document structure.

I have a personal preference for short-ish, single-subject documents. Decide if these are the kind of documents you want. Try to decide on a standard document format (HTML, Word, etc.) and stick to it for each Catalog, and decide how one document will reference another.

Everything above will come together into an Editor’s Brief. Such documentation is a great thing to have because it will help your content stay focused over the months as your content increases. The Editor’s Brief is also useful to searchers in that it helps them understand what is likely to be in the Catalog.

If you have an Editor’s Brief, it follows that there must be an Editor somewhere, which leads me nicely onto the subject of a Quality System for your Knowledge Base content – something you are likely to need from day one.

Here’s what the Quality System will need to ensure:

  • That the documents placed into the Catalog are technically accurate
  • That the documents are in accordance with the Editor’s Brief
  • That new documents being added are ‘unique’ (in other words, there is not already a document that addresses the same subject matter)
  • That consumers (searchers) can give feedback, and that the feedback will be read and, if needed, acted upon by editors
  • That a mechanism exists for the periodic review of documents
  • That a simple mechanism exists for suggesting new documents or content

I know this sounds a bit bureaucratic, but in practice it usually works out to be a common sense approach. I’ll expand on this in my next post, and will post an example quality system.

Tuesday, 10 Jul 2007

‘I don’t have much money to spare, and I wish the banks would make it a little harder for someone else to get what I do have,’ writes Mark James.

A few weeks back, I read a column in the IT trade press about my bank’s botched attempt to upgrade their website security and I realised that it’s not just me who thinks banks have got it all wrong… You see, the banks are caught in a dilemma between providing convenient access for their customers and keeping it secure. That sounds reasonable enough until you consider that most casual Internet users are not too hot on security and so the banks have to dumb it down a bit. Frankly, it amazes me that information like my mother’s maiden name, my date of birth, and the town where I was born are used for “security” – they are all publicly available details and if someone wanted to spoof my identity it would be pretty easy to get hold of them all!

But my bank is not alone in overdressing their (rather basic) security – one of their competitors recently “made some enhancements to [their] login process, ensuring [my] money is even safer”, resulting in what I can only describe as an unmitigated user experience nightmare. First I have to remember a customer number (which can at least be stored in a cookie – not advisable on a shared-user PC) and, bizarrely, my last name (in case the customer number doesn’t uniquely identify me?). After supplying those details correctly, I’m presented with a screen similar to the one shown below:

So what’s wrong with that? Well, for starters, I haven’t a clue what the last three digits of my oldest open account are so that anti-phishing question doesn’t work. Then, to avoid keystroke loggers, I have to click on the key pad buttons to enter the PIN and memorable date. That would be fair enough except that they are not in a logical order and they move around at every attempt to log in. This is more like an IQ test than a security screen (although the bank describes it as “simple”)!

I could continue with the anecdotal user experience disasters but I think I’ve probably got my point across by now. Paradoxically, the answer is quite simple and in daily use by many commercial organisations. Whilst banks are sticking with single factor (something you know) login credentials for their customers, companies often use multiple factor authentication for secure remote access by employees. I have a login ID and a token which generates a seemingly random (actually highly mathematical) 6 digit number that I combine with a PIN to access my company network. It’s easy – and all it needs is knowledge of the website URL, my login ID and PIN (things that I know), together with physical access to my security token (something I have). For me, those things are easy to remember but for someone else to guess… practically impossible.

I suspect the reason that the banks have stuck with their security theatre is down to cost. So, would someone please remind me, how many billions did the UK high-street banks make in profit last year? And how much money is lost in identity theft every day? A few pounds for a token doesn’t seem too expensive to me. Failing that, why not make card readers a condition of access to online banking and use the Chip and PIN system with our bank cards?

Friday, 06 Jul 2007

In my previous post, I talked about issues relating to document design for knowledge base content.

In this post I’m going to talk about optimising your document for indexing. Whilst what I’m writing applies directly to Serio users, my guess is this holds true for most content-retrieval systems.

The most important thing in a knowledge base is the content – as I’ve written before. To have something that works well (particularly with enterprise content) you don’t always need a retrieval system – for example, you could simply set up a few web pages that organise your content in a directory form.

However, as the number of documents grows, the need for a search mechanism increases. This search mechanism often takes the form of keyword or phrase searching – pretty much like internet search engines do – returning results based on relevance.

For example, Serio uses Microsoft Indexing Service, which is a standard part of the Windows family of operating systems. When you create content, and place it into the Catalog directory you’ve created, Indexing Service ‘behind the scenes’ reads the document and indexes it. When you search for something, it uses the indices it has created to produce your list of results.

When you create your content, you can give clues to the Indexing Service as to what the document is about – not all words on the page are treated equally. The following should help the Indexing Service to return the best documents to your consumers – but you need to have an idea of keywords. By keywords, I mean the phrases or words your knowledge base consumers might use to locate your document.

  • Title. Don’t use a standard title; create documents that have expressive, relevant titles containing some of the keywords you think searchers will use.
  • Bolding. Bolding of certain words seems to be a factor in assigning more importance to some words than to others. Use bolding intelligently, by bolding key words and phrases.
  • Location of words. Keywords near the top of the document can again help some documents rank above others for certain phrases.

Wednesday, 04 Jul 2007

I recently contacted a customer – an Incident Manager – whom I'd spent some time helping with his Incident Management procedures. He told me that things were going fine, but that some staff were not calling customers back within the 1-hour target (from the Incident being logged) set down in the SLA, or were not recording that they had done so, and so he'd needed to send a couple of emails.

I was strangely pleased to hear this, as it meant he was acting as an Incident Manager should – using the weekly checklist we'd drawn up to monitor that procedures are followed and following up where they aren't.

In an earlier post my colleague George asked "So What do Incident Managers do all day?".

An Incident Manager must ensure that the Incident Management process is documented, so that service delivery staff know what it is. They must also take charge of making sure that staff are aware of these procedures, for example through training, workshops, role plays, meetings, and so forth.

But all of this effort is pointless if no-one is actually following the procedures, so as Incident Manager you need to be checking that they are.

Following a weekly Incident Manager's checklist is a simple way of ensuring that such checks are systematic and regular. In addition, checklists provide important evidence that your Incident Management processes are working. In other words, they can be a Key Performance Indicator (KPI), with which you can supplement management reports to back up your recommendations for changes. (See "Some Service Level Management (SLM) Key Performance Indicators (KPI)").

So what sort of things would you include in an Incident Manager's Weekly Checklist? To be fair, you obviously must base your checklist on the documented Incident Management procedures you are providing to your staff (for example, your Operations Manual). After all, how can you expect people to follow a procedure you've never written down?

Clearly, procedures will vary from one company to another, but the following checks are typical from my experience, and will get you started with your own list. (Tip: For the benefit of yourself and other readers, I'd recommend you format your own list as a checklist, with tick boxes and spaces for your comments):

1. Take a sample of Incidents logged by different agents during the last week. Check the quality of the Incident details logged:

  • Have all the details listed on your call logging script been captured?
  • Has the correct priority been assigned?
  • Has the correct Configuration Item (CI) been recorded?
  • Is the description of good quality? (You need to break this down further by looking at whether standard questions were asked – version number? error message? - if appropriate description templates were used, if multiple Incidents are being described in one Incident, etc.)
  • Was the Incident correctly assigned?

2. Take a sample of Incidents logged within the last week and assigned to different agents. Check that your procedures have been followed with regard to responding promptly to Incidents logged:

  • Did the assigned agent respond to the customer within the target response time set out in your SLA?
  • Did they make the response in the correct manner (for example, if you have specified that agents should try phoning the customer before sending emails, have they done this?)

3. Take a sample of Incidents resolved by different agents within the last week.

  • Is there evidence that the agent has contacted the customer to notify them of the resolution?
  • Was a good quality resolution description provided, explaining clearly how the issue was solved (not 'done' or 'I fixed it')?
  • Has the correct Cause code been attributed to the Incident?

4. Look at all Incidents during the last week where the resolution target was missed. In each case, try to determine the reason and note this against the Incident.

And so forth. Your own checklist might continue with checks on whether correct escalation procedure had been followed, a review of how any Major Incidents or Complaints were handled, etc.

Monday, 02 Jul 2007

This is a follow-on from my earlier Knowledge base Structure post, where I defined something called enterprise content, looked at describing your consumers, creating Catalogs, and creating some example documents that will ‘set the style’ for the content you are going to create.

This post is all about the enterprise content documents themselves – what I think works best, and some pitfalls to avoid.

Short Documents are Best

My own preference is for concise, single topic documents. In other words, don’t create documents that address a range of issues or subjects – it’s preferable to have a document that tells you how to fix a single 'something', or do something of value to the enterprise.

Sometimes content authors sit down to write content, and end up with a longer, more comprehensive document – maybe that covers a number of errors or problems, and shows you how to resolve each.

I tend not to favour this approach, and here’s why. Content consumers will be searching – either by entering a search term, or by browsing to the document from an intranet page. The consumer will want to make a quick decision about ‘Is this document right for me?’. The longer and broader your document is, the harder that decision becomes, and the greater the likelihood that your consumer will miss the solution you’ve taken time to create.

People can also be lazy. If their search yields a number of documents, the consumer may not ‘scroll down’ and instead only read the part of the document that displays ‘naturally’. This phenomenon has been known to search marketers for years, with content you have to scroll down to being referred to as ‘below the fold’ and generally regarded as much less visited.

Think About Your Format

Word, HTML, PDF – or something else? For each Catalog try to stick to a particular format. Remember Word is not the best format if your consumers are Internet visitors.

How will Documents be Linked?

Sometimes you’ll need one knowledgebase article to reference another, maybe to supply supplemental information to the consumer. If you are using HTML as your primary format, it seems natural to create a hyperlink between the two.

This seems fine – on the face of it. However, in creating that link you’ve just increased your support burden and created a dependence between the two documents. If you move the relative locations of the two documents, or rename the linked-to document, then your link no longer works.

A better approach might be as follows. Give each document an alpha-numeric reference number that appears in the title, and tell searchers to see ‘document …. for more information’. In other words, they have to re-search for the document using the reference you’ve given them. Although it is not quite so convenient for consumers, it does mean that you are free to re-organise your content from time to time without developing broken links within your content. Any changes you make will be handled by your indexing system.
