
Troubleshooting system stability

There are two tools I find invaluable when troubleshooting system stability. The first is sigverif.exe, which checks for digital signatures on system files. The second is verifier.exe, which can be used to test the stability of drivers.

I only ever use Windows Hardware Quality Labs (WHQL) signed drivers on my Windows Servers, and I'd recommend you make this organisational policy for all servers (it can be enforced through group policy). Your servers are a vital resource and you just cannot afford to have untested third-party code running in kernel mode - this policy will be mandatory on future versions of Windows for the same reason.

If you do have unsigned drivers, or you suspect a signed driver is causing you problems, then you can use verifier.exe to monitor that driver. It will allow you to pinpoint the driver causing stability problems, but at the cost of system performance whilst the tool is in use.

If I'm asked why a server is crashing, the first thing I do is run sigverif to find any unsigned drivers. Then I either find suitable updates that are WHQL certified, or I tell the server owner to find replacement hardware that has WHQL drivers and send the old stuff back.

If neither is possible (like when I was stuck on a sailing ship in the middle of the sea with a laptop-driven GPS system that kept crashing), I turn to verifier.exe and start monitoring the drivers on the system. This helps me identify exactly which driver is causing the problem.
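If you want to script that step rather than click through the Driver Verifier Manager, a minimal sketch might look like the following. It assumes the documented verifier.exe switches /standard, /driver and /reset; the helper names and the driver name gpsport.sys are hypothetical and only there for illustration. Run it from an elevated prompt, and expect the settings to take effect only after a restart.

```python
import subprocess
import sys


def build_verifier_command(drivers, reset=False):
    """Build the verifier.exe argument list (hypothetical helper).

    reset=True clears all Driver Verifier settings; otherwise the
    standard checks are enabled for the named drivers.
    """
    if reset:
        return ["verifier", "/reset"]
    if not drivers:
        raise ValueError("name at least one driver to verify")
    return ["verifier", "/standard", "/driver", *drivers]


def run_verifier(drivers, reset=False):
    """Run verifier.exe with the built arguments and return its exit code."""
    if sys.platform != "win32":
        raise RuntimeError("verifier.exe is only available on Windows")
    command = build_verifier_command(drivers, reset=reset)
    # Needs administrator rights; the settings apply after the next restart.
    return subprocess.run(command, check=False).returncode
```

Calling run_verifier(["gpsport.sys"]) would start monitoring that one (made-up) driver. Once you have found the culprit, run_verifier([], reset=True) switches the checks off again so the server gets its performance back.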

I also check the BIOS version and see if there are updates available - you'll be amazed at some of the things that get fixed in subsequent BIOS releases.

The final step is to start checking the hardware. Windows Vista includes a new Memory Diagnostic tool that will help with memory troubleshooting. Other hardware will still require special software like PC Check, but if it's not the software and not the memory, maybe it's time to think about sending it back for a replacement?

It's not always possible to know exactly what's causing a system crash. The best you can do is to try to reduce the possibilities by following best practice.

Troubleshooting Stop messages: general strategies

Published Tuesday, August 15, 2006 4:22 PM by wigunara
