Mark Russinovich’s technical blog covering topics such as Windows troubleshooting, technologies and security.
When I experienced a crash in Internet Explorer (IE) on my home 64-bit gaming system one day, I chalked it up to random third-party plug-in memory corruption. I moved on, but a few days later had another crash in IE. Then, Windows Media Player (WMP) started crashing every third or fourth time I used it:
Crashes in different programs seemed to point at a more fundamental problem. I had over-clocked the CPU, so I speculated that the rash of crashes was a side-effect of CPU overheating and reluctantly dialed back the clock multiplier to the factory specification. To my dismay, however, the crashes continued. My next theory was that I had bad RAM, but the Windows Vista Memory Diagnostic failed to identify any problems.
Hardware problems seemingly cleared, my next move was to look at the process crash dumps to see if they held any clues. But first I had to find a crash dump to look at. Windows XP’s Application Error Reporting process always generates a dump before showing you the application crash dialog, and you can find the location of the dump by clicking to see the report details and then viewing the report’s technical information:
Windows Vista’s corresponding dialog doesn’t offer a way to get at a report’s technical information and it doesn’t generate a dump unless Microsoft’s Windows Error Reporting (WER) servers request it, which they only do for crashes reported in high volumes. Fortunately, WerFault, the process that presents the dialog, keeps the crashed process around until you press the Close Program button, which offers an opportunity to attach to the process with a debugger and examine it. You can see WerFault’s handle to a crashed Windows Media Player process in Process Explorer:
The next time I had a crash, I launched WinDbg, the Windows Debugger from the Debugging Tools for Windows package that’s available for free download from Microsoft. After making sure that I had the symbol configuration set to point at the Microsoft public symbol server (e.g. srv*c:\symbols*http://msdl.microsoft.com/download/symbols) in the Symbol File Path dialog, I went to the File menu and selected the “Attach to a Process...” menu entry:
That opens the WinDbg process selection dialog, which I scrolled through to find the crashed process. When I selected the process, WinDbg opened it and presented the same interface it does when it loads a crash dump, except that when you load a crash dump, you can execute the !analyze debugger command that uses heuristics to try and pinpoint the cause of the crash; when you perform a debugger attach, an analysis will just tell you what you already know, that you attached with a debugger:
Looking for a potential cause of a crash when attached requires looking at the stack of each thread in the process, so I opened the Processes and Threads and Call Stack dialogs in the View menu:
I started examining threads by selecting the first entry in the threads dialog:
The WinDbg command window usually grays and says “Busy” as WinDbg pulls symbols from the symbol server, after which the call stack dialog populates with the function nesting of the selected thread at the time of the crash. I examined each thread’s stack in turn, moving between threads by pressing the down arrow and then the enter key, hunting for a stack that had function names with the words “exception” or “fault” in them. Near the end of the list I came across this one:
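The thread-by-thread hunt described above can be roughly automated once the stacks are in a text file. The sketch below is not part of the original investigation: it assumes the output of WinDbg's `~*k` command (which prints the stack of every thread) has been saved to text, and that each thread's stack begins with a header line containing the thread number followed by `Id:`. The function name `find_suspect_threads` and the keyword list are illustrative choices, not anything WinDbg provides.

```python
import re
from typing import Dict, List

# Assumed shape of the per-thread header line in saved `~*k` output:
# optional "." or "#" marker, the thread number, then "Id: <pid>.<tid> ...".
THREAD_HEADER = re.compile(r"^[.#\s]*(\d+)\s+Id:\s")

# Words to look for in stack frames (the same ones hunted for by eye).
KEYWORDS = ("exception", "fault")


def find_suspect_threads(stack_text: str) -> Dict[int, List[str]]:
    """Map thread number -> stack frame lines that mention a keyword."""
    suspects: Dict[int, List[str]] = {}
    current = None  # thread number of the stack being read
    for line in stack_text.splitlines():
        header = THREAD_HEADER.match(line)
        if header:
            current = int(header.group(1))
            continue
        if current is None or not line.strip():
            continue  # skip text before the first thread and blank lines
        if any(word in line.lower() for word in KEYWORDS):
            suspects.setdefault(current, []).append(line.strip())
    return suspects
```

Calling `find_suspect_threads(open("stacks.txt").read())` (the file name is hypothetical) returns only the threads worth a closer look. It is a plain text match, so it is a filter for candidates rather than a diagnosis; the flagged stack still has to be read to see which caller led to the exception.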
I noticed that the top of the list is full of functions with “Exception” in their names. Looking down the list (up the stack), I saw that a function in Nvappfilter called Kernel32.dll’s HeapFree function, leading to the crash. The exception in the heap’s free routines meant that either the caller passed a bogus heap address or that the heap was already corrupted when the function executed. If a Windows DLL had been the caller I would have suspected the latter, but in this case the caller was a third-party DLL, which I could tell by the fact that WinDbg couldn’t locate symbol information for it and hence didn’t know the names of the functions within it. I confirmed that by issuing the lm (list module) command to look at its version information:
Nvappfilter was now my primary suspect, but I didn’t have direct evidence that it was responsible. I continued to use the system and followed the same debugging steps on the next several crashes. Whether it was IE, WMP or a game, the faulting stack was always the same, with Nvappfilter calling HeapFree. That’s still not conclusive proof, but the anecdotal evidence was pretty compelling.
At that point I went to see if there were updates for Nvappfilter, but I wasn’t sure what software package it was associated with. I entered its name in a Web search and discovered that it’s part of nVidia’s FirstPacket feature that prioritizes game traffic and that’s included in the nForce motherboard’s software:
I went to nVidia’s site and downloaded the most recent nForce driver package, but it failed to update Nvappfilter.dll and I continued to have the crashes.
The nVidia control panel offers no way that I could find to prevent Nvappfilter from loading, so my only recourse was to manually disable it. I wasn’t using the FirstPacket feature, which I had previously been unaware of, so I wouldn’t miss it, but first I had to figure out how it configured Windows to load it. For that I turned to Autoruns, where I found references to Nvappfilter’s 32-bit and 64-bit versions in the Winsock Layered Service Provider (LSP) section:
I deleted all of Nvappfilter’s entries, rebooted the system and have been crash-free since. While I was writing this post, I checked again for nForce software updates to see if Nvappfilter had been updated. The latest version doesn’t look like it includes Nvappfilter or any other Winsock LSP, so assuming Nvappfilter was at fault, it’s no longer an issue.
One other thing I’ve done since I investigated these crashes is take advantage of Vista SP1’s “local dumps” functionality so that I'll automatically get a crash dump to investigate for any application crash I experience. If you create a key named HKLM\Software\Microsoft\Windows\Windows Error Reporting\LocalDumps, WerFault will always save a dump. Crashes go by default into %LOCALAPPDATA%\Crashdumps, but you can override that with a Registry value and also specify a limit on the number of crashes WerFault will keep.
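As a minimal sketch of that configuration, the helper below builds the `reg add` command lines that would create the LocalDumps key and, optionally, set the two overrides. The value names `DumpFolder` and `DumpCount` are not given in this post; they are taken from Microsoft's documentation on collecting user-mode dumps, and the function name `local_dumps_commands` is made up for illustration. The function only returns strings, so nothing is written to the Registry until you run the commands yourself from an elevated prompt.

```python
from typing import List, Optional

LOCAL_DUMPS_KEY = (
    r"HKLM\SOFTWARE\Microsoft\Windows\Windows Error Reporting\LocalDumps"
)


def local_dumps_commands(
    dump_folder: Optional[str] = None,
    dump_count: Optional[int] = None,
) -> List[str]:
    """Return reg.exe command lines that enable WER local dumps."""
    # Creating the key by itself is enough to make WerFault save dumps.
    commands = [f'reg add "{LOCAL_DUMPS_KEY}" /f']
    if dump_folder is not None:
        # DumpFolder (assumed name, per Microsoft's docs) overrides the
        # default dump location.
        commands.append(
            f'reg add "{LOCAL_DUMPS_KEY}" /v DumpFolder '
            f'/t REG_EXPAND_SZ /d "{dump_folder}" /f'
        )
    if dump_count is not None:
        if dump_count < 1:
            raise ValueError("dump_count must be at least 1")
        # DumpCount (assumed name, per Microsoft's docs) caps how many
        # dumps are kept in the folder.
        commands.append(
            f'reg add "{LOCAL_DUMPS_KEY}" /v DumpCount '
            f"/t REG_DWORD /d {dump_count} /f"
        )
    return commands


if __name__ == "__main__":
    for command in local_dumps_commands(r"C:\Dumps", 5):
        print(command)
```

The folder `C:\Dumps` and the limit of 5 are placeholder values. Generating the commands rather than writing to the Registry directly keeps the example reviewable before anything on the system changes.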
When I had trouble using wmplayer.exe, I did not connect its freezes to a third-party DLL. Windows Media Player would start, show its user interface, and then freeze when a DVD was inserted. I wish I had used "Attach to a Process" when I tried WinDbg. My focus was on the high amount of private bytes, but I should have realized that as soon as an actual feature was used, a DLL would effectively be injected. If WinDbg wasn't listing the functions, it is hard to conceive of such a powerful tool as Process Explorer, chock full of case histories. Thank you, Mark.
Many of your articles read like detective stories, and this one touches on a painful subject for me. Today I was looking for the biggest resource consumers on my system. I was very surprised to see that Windows Live Messenger uses 30 to 40 threads, even though it is only an instant messaging program, while Visual Studio with the ARCHSVN and Resharper add-ons gets by with only 30 threads. Live Messenger holds over 800 GDI objects, while Visio with a large diagram open uses only 400.
Bad programming habits exist even at Microsoft, but by learning from their mistakes developers can get better. Could you find time to give a lecture to the Windows Live products department about economical use of system resources? I get the impression that they test their products on the 32-CPU monster with 2 TB of memory from your next article.
P.S. I would give up the use of Live Messenger, however all my partners in the US use it.
This is one of those articles I have returned to several times. Great information! I have followed your work at Sysinternals and now Microsoft and am continually amazed. Thank you for sharing your knowledge!
I had a similar problem configuring RAID on an NVIDIA board.
For the strangest of reasons it had reverted from a RAID to a non-RAID configuration.
I found out from the dumps that one of the drivers was not happy on Windows 2008. I checked NVIDIA's site for WS 2008 support for RAID drivers and didn't find any.