I'm asked to write today for a customer who is taking their first steps with Problem Management - in particular, I'm asked to write about practical strategies for Problem resolution. We have other Problem Management resources, such as a white paper and blog posts here, here and here.
So what do I mean by practical strategies? Something that is apparently not obvious to all engineers - I have a problem, how do I resolve it (or more accurately, get to the bottom of it)? I'm going to assume that the Problem Description is accurate and complete, and that you've got access to an accurate CMDB.
1. Use the Internet wisely.
For some Problems (but not all), engineers will be using the Internet. Searching the Internet is not as straightforward as it sounds. Keep these tips in mind.
- Use different search engines. Results can vary widely, and some search engines (like Google) place less emphasis on on-page factors (such as titles and content) than they do on links. Others, like live.com, place somewhat more emphasis on on-page content.
So if you don't find what you are looking for on Google, try Yahoo or live.com.
- Know how to perform advanced searching. For example, suppose you are trying to search a vendor website for an error message, but find that either the site has no search mechanism or its search mechanism returns poor results. Did you know that you can use Internet search engines to search the site? (I almost never use a website's built-in search mechanism.)
For instance, to search for the word kpi on seriosoft.com, you can type this into your favourite search engine:
site:seriosoft.com kpi
and you will get results only from seriosoft.com that contain the term kpi. Similarly, to search for a phrase such as "Incident Management" you can use
site:seriosoft.com "incident management"
The site command works on Google, Yahoo and Live.
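If your team runs the same site-restricted searches often, the queries above can be generated rather than typed. What follows is only a minimal sketch in Python: the function names `site_query` and `search_url` are made up for illustration, and the search base URL and the `q` parameter name are assumptions that you should check against whichever search engine you use.

```python
from urllib.parse import urlencode


def site_query(domain: str, terms: str, exact_phrase: bool = False) -> str:
    """Build a site-restricted query, e.g. site:seriosoft.com kpi."""
    terms = terms.strip()
    if exact_phrase:
        # Double quotes ask the engine to match the words as a phrase.
        terms = f'"{terms}"'
    return f"site:{domain} {terms}"


def search_url(engine_base: str, query: str) -> str:
    """Attach the query to a search URL as a URL-encoded q= parameter.

    Both the base URL and the parameter name 'q' are assumptions here.
    """
    return f"{engine_base}?{urlencode({'q': query})}"


if __name__ == "__main__":
    print(site_query("seriosoft.com", "kpi"))
    print(site_query("seriosoft.com", "incident management", exact_phrase=True))
    print(search_url("https://www.google.com/search",
                     site_query("seriosoft.com", "kpi")))
```

The first two calls reproduce the two queries shown above. The third wraps one of them in a link that could be stored in a Knowledgebase document so that colleagues can rerun the search with one click.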
2. Maintain a list of useful web sites, organised by topic. This is a perfect candidate for a Knowledgebase document.
3. Identify the right expert forums, and participate.
Take the time to find the best expert newsgroups and forums for your particular technical discipline. Most of these are free, although a few do require a small fee for registration. This gives you a chance to discuss issues with others, and from experience more people will take the time to reply to your questions if you are seen to be helping others.
4. Always try to replicate the error.
In my experience this is something that not everyone does, but it's usually worthwhile. For some situations it's not practical, but most of the time it is. Even if the error is not repeatable, that in itself is a fact worth knowing and recording.
5. Maintain test systems in advance of actually needing them.
If you do this you'll save a lot of time in trying to replicate errors and you'll be quicker in devising workarounds.
6. Make sure the Problem Management team knows about groups.google.com - it's an excellent (searchable) Usenet archive.