Debugging by David J. Agans
Publisher: Amacom (September 12, 2006)
In his book, Agans discusses what he refers to as “… the 9 indispensable rules …” for isolating problems. I’ll be referring to these rules in the context of being an IT Professional.
Understand the System – Debugging, Chapter 3, pg 11
In order to isolate a problem, Agans discusses the need to understand the system you’re working with. Consider the following.
Purpose – What is the system designed to do and does this match your expectation? It’s surprising how often an issue has its roots in misunderstanding the capabilities of a technology.
Configuration – How was the system deployed and does that match intentions? Do you have a test environment? If you have a test environment, you can compare “good” with “bad” or even reproduce the issue and have a safe place to experiment with solutions.
Interdependencies – Know what each component relies on. Take DFSR as an example: it depends on network connectivity and ports, name resolution, the file system and Active Directory. A problem in any of these components can surface as a symptom in DFSR. Understanding the interplay between these “blocks”, and what each “block” is responsible for, will greatly assist you in isolating problems.
Tools – It could be argued that tools aren’t part of the system but without knowing how to interrogate each component, you’re unlikely to get very far. Log files, event logs, command line utilities and management UIs all tell you something about configuration and behaviour. Further to this, you need to know how to read and interpret the output. Your tools might include log processing scripts or even something as obscure as an Excel pivot table.
If you don't know how the system works, look it up. Seek out every piece of documentation you can find and read it. Build a test environment and experiment with configuration. Understand what “normal” looks like.
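The log processing scripts mentioned under Tools can be very small. The following is a minimal sketch that counts log entries per source and severity, much as a pivot table would. The sample lines and the "timestamp level source message" layout are assumptions made for illustration, not the format of any particular product, and the function name is hypothetical.

```python
from collections import Counter

# Hypothetical log lines. The "timestamp level source message" layout is an
# assumption for illustration only.
SAMPLE_LOG = """\
2024-01-05T10:00:01 ERROR dns lookup failed for fs01
2024-01-05T10:00:02 INFO replication backlog at 12 files
2024-01-05T10:00:07 ERROR dns lookup failed for fs02
2024-01-05T10:01:13 WARNING disk volume D: at 91 percent
2024-01-05T10:02:40 ERROR replication connection to fs02 refused
"""


def pivot_by_source_and_level(text):
    """Count log entries per (source, level) pair, like a small pivot table."""
    counts = Counter()
    for line in text.splitlines():
        parts = line.split(maxsplit=3)
        if len(parts) < 4:
            continue  # skip lines that do not match the assumed layout
        _timestamp, level, source, _message = parts
        counts[(source, level)] += 1
    return counts


if __name__ == "__main__":
    summary = pivot_by_source_and_level(SAMPLE_LOG)
    for (source, level), count in sorted(summary.items()):
        print(f"{source:<12} {level:<8} {count}")
```

A summary like this helps establish what “normal” looks like: run it against logs from a healthy period and compare the counts with those from the period in which the problem occurred.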
Check the Plug – Debugging, Chapter 9, pg 107
Start at the beginning and question your assumptions. Don't rule out the obvious; check the basics first. More than a few issues have dragged on far longer than necessary because something simple was overlooked early in the investigation. Can servers ping each other? Does name resolution work? Does the disk have free space?
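The basic checks above are easy to script so that they get run every time. This is a rough sketch using only the Python standard library; the function names are hypothetical. Note that it swaps in a TCP connection attempt for ping, because ICMP normally requires elevated privileges, so a result here only says whether a specific port is reachable.

```python
import shutil
import socket


def can_resolve(hostname):
    """Return True if the name resolves to at least one address."""
    try:
        return len(socket.getaddrinfo(hostname, None)) > 0
    except socket.gaierror:
        return False


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def free_space_percent(path):
    """Return the percentage of free space on the volume holding path."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.free / usage.total


if __name__ == "__main__":
    # "localhost", port 445 (SMB) and "." are placeholders; substitute the
    # servers, ports and volumes that matter in your environment.
    print("name resolution:", can_resolve("localhost"))
    print("TCP 445 reachable:", can_connect("localhost", 445, timeout=1.0))
    print(f"free space: {free_space_percent('.'):.1f}%")
```

Running the same checks from each server involved, rather than from your workstation, tests the paths the system itself actually uses.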
Do your tools do what you think they do? If you have doubts, it’s time to review your understanding of the system.
Are you misinterpreting data? Avoid jumping to conclusions, and verify your results with a second tool where you can. If you hear yourself saying, “I think this data is telling me …”, find a way to test your theory.
Divide and Conquer – Debugging, Chapter 6, pg 67
Rather than trying to look at everything in detail, narrow the scope. Divide the system into pieces and verify the behaviour in each area before you get too deep.
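One way to narrow the scope is to treat the system as an ordered chain and halve the search area with each check. The sketch below illustrates that idea under two loudly stated assumptions: the components can be arranged in a linear order, and once one stage fails, every checkpoint after it also looks bad. Real systems are rarely this tidy. The function name, the `is_good_after` callback and the ordering of the example stages are all hypothetical.

```python
def first_failing_stage(stages, is_good_after):
    """Binary-search an ordered chain of stages for the first failing one.

    is_good_after(i) must return True when everything up to and including
    stage i checks out. Assumes that once a stage fails, all later
    checkpoints also fail. Returns the index of the first failing stage,
    or None if every stage passes.
    """
    if not stages or is_good_after(len(stages) - 1):
        return None  # nothing to check, or the end of the chain is healthy

    low, high = 0, len(stages) - 1  # the first bad stage lies in [low, high]
    while low < high:
        mid = (low + high) // 2
        if is_good_after(mid):
            low = mid + 1  # fault is downstream of mid
        else:
            high = mid     # fault is at mid or upstream of it
    return low


if __name__ == "__main__":
    # Illustrative ordering only; the real dependency order is an assumption.
    stages = ["network", "name resolution", "Active Directory",
              "file system", "replication"]
    simulated_fault = 2  # pretend the third stage is broken
    index = first_failing_stage(stages, lambda i: i < simulated_fault)
    print("first failing stage:", stages[index])
```

With five stages the saving is small, but the same approach applies to longer chains, such as a sequence of configuration changes or a multi-hop network path, where checking every element in turn would be slow.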
- Mark “cut it out with a scalpel” Renoden