Troubleshooting Server Hangs – Part Three

In our last post on Server Hangs, we discussed using the Debugging Tools to examine a dump file to analyze pool depletion.  Today we are going to use the same troubleshooting tools to examine a server hang caused by a handle leak.  Issues where a process holds an abnormal number of handles are very common and can result in kernel memory depletion.  A quick way to find the number of handles for each process is to check the Processes tab in Task Manager.  You may have to add the Handles column from View > Select Columns.

As a rule of thumb, if a process has more than 10,000 handles, we probably want to take a look at what is going on.  That does not necessarily mean that it is the offending process, just a suspect.  Some processes, such as a database or another memory-intensive application, legitimately hold that many handles.  The most common example is the STORE.EXE process for Exchange Server, which routinely has well over 10,000 handles.  On the other hand, if our Print Spooler process has 10,000 (or more) handles, then we most likely have an issue.

Once we know there is a handle leak in a particular process, we can dump out all of its handles and figure out why it is leaking.  To find out from a dump whether any process has an abnormally large number of handles, we first list all of the processes and then examine the handle count for each one.  To list all of the processes that were running on the box, we use the !process 0 0 command in the Debugging Tools.  This gives us output similar to what we see below:

0: kd> !process 0 0
PROCESS 8a5295f0  SessionId: none  Cid: 0004    Peb: 00000000  ParentCid: 0000
    DirBase: 0acc0020  ObjectTable: e1002e68  HandleCount: 1056.
    Image: System

PROCESS 897e6c00  SessionId: none  Cid: 04fc    Peb: 7ffd4000  ParentCid: 0004
    DirBase: 0acc0040  ObjectTable: e1648628  HandleCount:  21.
    Image: smss.exe

PROCESS 89a26da0  SessionId: 0  Cid: 052c    Peb: 7ffdf000  ParentCid: 04fc
    DirBase: 0acc0060  ObjectTable: e37a7f68  HandleCount: 691.
    Image: csrss.exe

PROCESS 890f0da0  SessionId: 0  Cid: 0548    Peb: 7ffde000  ParentCid: 04fc
    DirBase: 0acc0080  ObjectTable: e1551138  HandleCount: 986.
    Image: winlogon.exe

PROCESS 89a345a0  SessionId: 0  Cid: 0574    Peb: 7ffd9000  ParentCid: 0548
    DirBase: 0acc00a0  ObjectTable: e11d8258  HandleCount: 396.
    Image: services.exe
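
On a busy server the !process 0 0 listing can run to hundreds of entries, so it can help to sort it by handle count rather than read it by eye.  The following is a minimal sketch, assuming the debugger output has been saved to a text file (for example with a debugger log) and follows the same layout as the listing above.  The function names parse_process_list and suspects are hypothetical helpers, not part of the Debugging Tools, and the 10,000 default is simply the rule of thumb from this post.

```python
import re

# Patterns for the three lines that make up one entry in !process 0 0 output.
PROCESS_RE = re.compile(r"^PROCESS\s+([0-9a-fA-F`]+)")
HANDLE_RE = re.compile(r"HandleCount:\s*(\d+)")
IMAGE_RE = re.compile(r"^Image:\s*(\S.*)")

# Two entries copied from the listing above, used as sample input.
SAMPLE = """\
PROCESS 8a5295f0  SessionId: none  Cid: 0004    Peb: 00000000  ParentCid: 0000
    DirBase: 0acc0020  ObjectTable: e1002e68  HandleCount: 1056.
    Image: System

PROCESS 897e6c00  SessionId: none  Cid: 04fc    Peb: 7ffd4000  ParentCid: 0004
    DirBase: 0acc0040  ObjectTable: e1648628  HandleCount:  21.
    Image: smss.exe
"""


def parse_process_list(text):
    """Return (image, process_address, handle_count) for each entry."""
    results = []
    address = count = None
    for raw in text.splitlines():
        line = raw.strip()
        m = PROCESS_RE.match(line)
        if m:
            # A new entry starts; forget any half-parsed previous one.
            address, count = m.group(1), None
            continue
        m = HANDLE_RE.search(line)
        if m and address is not None:
            count = int(m.group(1))
            continue
        m = IMAGE_RE.match(line)
        if m and address is not None and count is not None:
            results.append((m.group(1).strip(), address, count))
            address = count = None
    return results


def suspects(text, threshold=10000):
    """Return entries above the threshold, highest handle count first."""
    rows = [r for r in parse_process_list(text) if r[2] > threshold]
    return sorted(rows, key=lambda r: r[2], reverse=True)


if __name__ == "__main__":
    for image, address, count in suspects(SAMPLE, threshold=500):
        print(f"{image:<20} {address}  {count}")
```

The process address in each result is the value to pass to .process in the next step.  A process that shows up here is still only a suspect, so known heavy users such as STORE.EXE should be judged against their normal baseline.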

The important piece of information here is the HandleCount.  For the purposes of this post, let’s assume that there is a problem with SMSS.EXE and that there is an unusually high HandleCount.  To view all of the handles for the process, the first thing we need to do is switch to the context of the process and then dump out all of the handles as shown below.  The relevant commands are:

  • .process -p -r <processaddress> : this switches us to the context of the process
  • !handle : this dumps out all of the handles

0: kd> .process -p -r 897e6c00
Implicit process is now 897e6c00
0: kd> !handle
processor number 0, process 897e6c00
PROCESS 897e6c00  SessionId: none  Cid: 04fc    Peb: 7ffd4000  ParentCid: 0004
    DirBase: 0acc0040  ObjectTable: e1648628  HandleCount:  21.
    Image: smss.exe

Handle table at e1674000 with 21 Entries in use
0004: Object: e1009568  GrantedAccess: 000f0003 Entry: e1674008
Object: e1009568  Type: (8a5258b8) KeyedEvent
    ObjectHeader: e1009550 (old version)
        HandleCount: 53  PointerCount: 54
        Directory Object: e10030a8  Name: CritSecOutOfMemoryEvent

0008: Object: 8910b370  GrantedAccess: 00100020 (Inherit) Entry: e1674010
Object: 8910b370  Type: (8a54c730) File
    ObjectHeader: 8910b358 (old version)
        HandleCount: 1  PointerCount: 1
        Directory Object: 00000000  Name: \WINDOWS {HarddiskVolume1}

000c: Object: e1af9828  GrantedAccess: 001f0001 Entry: e1674018
Object: e1af9828  Type: (8a512ae0) Port
    ObjectHeader: e1af9810 (old version)
        HandleCount: 1  PointerCount: 12
        Directory Object: e1002388  Name: SmApiPort
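
When a process really is leaking, the !handle output can contain thousands of entries, and a tally by object type is often a quicker first look than reading them one at a time.  The sketch below assumes the !handle output has been saved as text in the same layout as the listing above.  The function name count_handle_types is a hypothetical helper, and the sample is trimmed from the entries shown in this post.

```python
import re
from collections import Counter

# Matches the detail line of each entry, e.g.
# "Object: e1af9828  Type: (8a512ae0) Port", and captures the type name.
TYPE_RE = re.compile(
    r"^Object:\s+[0-9a-fA-F`]+\s+Type:\s+\([0-9a-fA-F`]+\)\s+(\S+)"
)

# Trimmed copy of the three entries from the listing above.
HANDLE_SAMPLE = """\
0004: Object: e1009568  GrantedAccess: 000f0003 Entry: e1674008
Object: e1009568  Type: (8a5258b8) KeyedEvent
        HandleCount: 53  PointerCount: 54

0008: Object: 8910b370  GrantedAccess: 00100020 (Inherit) Entry: e1674010
Object: 8910b370  Type: (8a54c730) File
        HandleCount: 1  PointerCount: 1

000c: Object: e1af9828  GrantedAccess: 001f0001 Entry: e1674018
Object: e1af9828  Type: (8a512ae0) Port
        HandleCount: 1  PointerCount: 12
"""


def count_handle_types(text):
    """Count how many handles in the dump refer to each object type."""
    counts = Counter()
    for raw in text.splitlines():
        m = TYPE_RE.match(raw.strip())
        if m:
            counts[m.group(1)] += 1
    return counts


if __name__ == "__main__":
    for type_name, total in count_handle_types(HANDLE_SAMPLE).most_common():
        print(f"{type_name:<16} {total}")
```

If one type accounts for the bulk of the handles, the names on those entries are a reasonable place to start looking for the component that opens them without closing them.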

At this point we can continue to dig into the handles to determine if there is something amiss.  More often than not, this would be an issue for which systems administrators would be contacting Microsoft Support.  However, by using this method you can quickly determine whether the problem lies with a third-party component and engage that vendor directly.  Being able to provide them with a dump file that shows that their component is consuming an excessive number of handles can assist them in providing you with a quicker resolution.

That’s it for today.  In our next post on Server Hangs, we’ll look at how a lack of available System PTEs can cause server hangs.

- Sakthi Ganesh

