Thoughts from the EPS Windows Server Performance Team
(Pre-Windows Server 2008)
Description: A hang is typically defined as a condition where a machine is non-responsive over the network and/or at the console. This usually manifests as an inability to log on at the console or to a session, or as a session becoming unresponsive to input or network traffic. This is not to be confused with a crash or bugcheck, which indicates a software or kernel fault. This document is specific to instances where a machine hangs or becomes unresponsive during normal use. It does not apply to the following symptoms, which are covered elsewhere:
Server hang during boot
Server hang after CTRL-ALT-DEL
Server hang at Applying Computer Settings
Server hang at Shutdown
This document applies to:
Windows 2000 Service Pack 4 with Update Rollup Package 1. (Mainstream support ended
Windows Server 2003 RTM (Mainstream support ended 3/30/2007)
Windows Server 2003 Service Pack 1 (Mainstream support ended 4/14/2009)
Windows Server 2003 Service Pack 2 (Mainstream support ends 7/13/2010)
Scoping the Issue: Define the type of hang:
1. Is the console hung or is it an issue with network connectivity?
2. Does Ctrl-Alt-Delete bring up the Windows Security dialog?
3. Can you toggle Caps Lock or Num Lock? If you can't, it could be a hardware or driver problem.
4. Can you move the mouse?
5. Is there a KVM in use?
6. When did the issue start occurring? DDMMYYYY, HH:MM:SS
7. What changed?
8. How long has the server been in production?
9. How often does the issue occur?
10. Under what conditions does the issue occur?
11. What else is going on when the issue occurs?
12. Does it happen at a particular time of day (users logging in, scheduled tasks, backups, etc.)?
13. Is there anything you can do to make the problem occur (repro steps)?
14. Can you ping the server by IP address, NetBIOS name, or fully qualified domain name?
15. Can you open network shares? Can users connect to file shares on the hung machine? Are there any errors?
16. Are you able to log on at the physical console? If so, are there any errors?
17. Are you able to log on via Remote Desktop (RDP client)? Are there any errors?
If this is a terminal server, are you observing this behavior from a session or at the console?
18. Are you able to open Computer Management remotely? Are there any errors?
19. What do you do to recover from the hang?
20. How long have you waited before rebooting the server?
21. What have you tried to do to fix the problem?
22. If it’s not completely hung and we can get to Task Manager, check resources:
CPU time - is there a specific process pegging the CPU?
If so, and it is a third-party process, what happens if we end it?
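The reachability questions above (name resolution, file shares, Remote Desktop) can be partly scripted from another machine. The following is a minimal sketch, not a replacement for the checklist: it swaps ICMP ping for TCP connection attempts to port 445 (SMB) and port 3389 (RDP), because a plain Python script cannot send ICMP without raw-socket privileges. The function names and the report layout are hypothetical and chosen only for illustration.

```python
import socket

# Well-known TCP ports for the services asked about in the checklist.
PORTS = {"SMB": 445, "RDP": 3389}


def resolve(name):
    """Return the IPv4 address for a host name, or None if resolution fails."""
    try:
        return socket.gethostbyname(name)
    except OSError:
        return None


def tcp_open(address, port, timeout=3.0):
    """Return True if a TCP connection to address:port succeeds within timeout."""
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return True
    except OSError:
        return False


def triage(targets, ports=PORTS, resolver=resolve, prober=tcp_open):
    """Resolve each target name and probe each service port on it.

    targets can mix an IP address, a NetBIOS-style short name, and an FQDN,
    so that a failure in only one form points at name resolution.
    """
    report = {}
    for target in targets:
        address = resolver(target)
        if address is None:
            report[target] = {"resolved": None}
            continue
        entry = {"resolved": address}
        for label, port in ports.items():
            entry[label] = prober(address, port)
        report[target] = entry
    return report
```

Calling `triage(["10.0.0.5", "SRV01", "srv01.example.com"])` (all three values are placeholders) returns one entry per name. A server that resolves and accepts connections on port 445 but not on 3389 suggests a different problem than one that accepts neither.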
Data Gathering: One of the most useful tools in diagnosing system hangs is Performance Monitor (Perfmon) logging. Perfmon allows the user to gather performance counters for various objects relating to system health, such as: Memory, Network Interface, Physical Disk, Processor, Process, etc.
In all instances, collect:
1. MPS Reports PFE version
Microsoft Premier Services Reporting Utility (PFE version)
2. Perfmon logs should include the timeframe when the problem is happening on the system.
You can create the log parameters manually, or by using the Performance Monitor Wizard.
You should capture the logs remotely from another computer.
a. Set up the remote Binary Circular performance log to grab all core OS counters:
· Logical disk
· NBT Connections
· Network interface
· Paging File
· Physical disk
· Server Work Queues
The Perfmon capture interval is determined by the length of time it takes the server to go from a normal state, to a problem state.
Please gather two concurrent Perfmon logs:
b. Short interval: 5 seconds.
c. Long interval
Please use the table below to set the capture interval.
If the average time to issue is:
The capture interval should be:
d. In Windows 2000, a common problem encountered when attempting to collect Perfmon logs remotely is that by default, the Performance Logs and Alerts service is started under the local computer’s “System” account. For steps on how to enable a network account to have permissions on the Performance Logs and Alerts service, please refer to Microsoft KB Article 240389: Log is not started when you try to start a log with remote counters in System Monitor.
e. In Windows Server 2003, you can simply use the "RunAs" option when setting up the counters.
3. Set up the server for a complete memory dump per KB 972110.
Proactively, make sure that:
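As an alternative to the wizard, the two concurrent logs can be defined from the command line of the monitoring computer with `logman`. The sketch below only builds the argument lists; it does not run anything. It assumes the Perfmon object names are `LogicalDisk`, `NBT Connection`, `Network Interface`, `Paging File`, `PhysicalDisk`, and `Server Work Queues`, and that the log is collected remotely by prefixing each counter path with the target server name. The helper names, the 200 MB size cap, and the output paths are hypothetical.

```python
# Perfmon object names assumed to correspond to the counters listed above.
CORE_OBJECTS = [
    "LogicalDisk",
    "NBT Connection",
    "Network Interface",
    "Paging File",
    "PhysicalDisk",
    "Server Work Queues",
]


def counter_paths(server, objects=CORE_OBJECTS):
    """Build wildcard counter paths that point at a remote server."""
    return [r"\\{}\{}(*)\*".format(server, obj) for obj in objects]


def logman_args(name, server, interval_seconds, output_path, max_mb=200):
    """Return the argument list for one binary circular counter log.

    -f bincirc selects the Binary Circular format, -max caps the file size
    in MB, -si sets the sample interval, and -o sets the output path.
    """
    return [
        "logman", "create", "counter", name,
        "-f", "bincirc",
        "-max", str(max_mb),
        "-si", str(interval_seconds),
        "-o", output_path,
        "-c",
    ] + counter_paths(server)


# Short-interval log: 5 seconds, as specified in step b.
short_log = logman_args("hang_short", "SRV01", 5, r"C:\PerfLogs\hang_short")

# Long-interval log: the value comes from the capture interval table;
# 60 here is only a placeholder, not a recommendation.
long_log = logman_args("hang_long", "SRV01", 60, r"C:\PerfLogs\hang_long")
```

Each list can be passed to `subprocess.run` on the monitoring computer, followed by `logman start hang_short` and `logman start hang_long`. Building the arguments as a list avoids quoting problems with object names that contain spaces.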
Troubleshooting / Resolution:
1. In the System event log, look for Event ID 2019 (nonpaged pool exhausted) and Event ID 2020 (paged pool exhausted), both logged by source Srv.
2. In Perfmon, check for any Process --> NameofProcess --> Handle Count value larger than 15,000.
Note: For LSASS.exe on domain controllers, a value of up to 50,000 is normal.
Note: For Store.exe on Exchange servers, a value of up to 65,000 is normal.
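The handle check in step 2 can be automated once the binary log has been converted to CSV (for example with `relog`). The sketch below assumes the usual Perfmon CSV layout: a timestamp column followed by one column per counter, with headers such as `\\SRV01\Process(lsass)\Handle Count`. It applies the 15,000 threshold and the two exceptions from the notes. As a simplification, it applies the LSASS and Store exceptions on every server, not only on domain controllers and Exchange servers. The function name is hypothetical.

```python
import csv
import io
import re

DEFAULT_LIMIT = 15000
# Higher limits taken from the notes above (process names in lower case).
EXCEPTIONS = {"lsass": 50000, "store": 65000}

# Matches the tail of a header such as \\SRV01\Process(lsass)\Handle Count
HANDLE_COLUMN = re.compile(
    r"\\Process\((?P<name>[^)]+)\)\\Handle Count$", re.IGNORECASE
)


def flag_handle_counts(csv_text):
    """Return {process instance: peak handle count} for peaks over the limit."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)

    # Map column index -> process instance name for Handle Count columns.
    columns = {}
    for index, title in enumerate(header):
        match = HANDLE_COLUMN.search(title)
        if match:
            columns[index] = match.group("name")

    # Track the highest sample seen for each process instance.
    peaks = {}
    for row in reader:
        for index, name in columns.items():
            try:
                value = float(row[index])
            except (ValueError, IndexError):
                continue  # blank or missing sample
            peaks[name] = max(peaks.get(name, 0.0), value)

    flagged = {}
    for name, peak in peaks.items():
        # Instances such as "svchost#1" share the limit of "svchost".
        base = name.split("#")[0].lower()
        if peak > EXCEPTIONS.get(base, DEFAULT_LIMIT):
            flagged[name] = peak
    return flagged
```

A process that appears in the result is a candidate for a handle leak. Whether its count is climbing steadily or merely high still has to be confirmed by looking at the trend in the log.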
Additional Resources:
972110 How to generate a kernel dump file or a complete memory dump file in Windows Server 2003
177415 How to use Memory Pool Monitor (Poolmon.exe) to troubleshoot kernel mode memory leaks
164933 How to allow Poolmon.exe to run by setting GlobalFlag value
Using PoolMon to Find a Kernel-Mode Memory Leak
246758 How to Monitor Performance of a Remote Computer Without Logging on to It
969639 Error message when you try to access the Performance Monitor (Perfmon.exe) on a remote computer: "Access Is Denied"
888989 A Performance Monitor counter for the Physical Disk performance object may not be displayed in Windows 2000
248993 PRB: Performance Object Is Not Displayed in Performance Monitor