Mark Russinovich’s technical blog covering topics such as Windows troubleshooting, technologies and security.
When I experienced a crash in Internet Explorer (IE) on my home 64-bit gaming system one day, I chalked it up to random third-party plug-in memory corruption. I moved on, but a few days later had another crash in IE. Then, Windows Media Player (WMP) started crashing every third or fourth time I used it:
Crashes in different programs seemed to point at a more fundamental problem. I had over-clocked the CPU, so I speculated that the rash of crashes was a side effect of CPU overheating and reluctantly dialed back the clock multiplier to the factory specification. To my dismay, however, the crashes continued. My next theory was that I had bad RAM, but the Windows Vista Memory Diagnostic failed to identify any problems.
Hardware problems seemingly cleared, my next move was to look at the process crash dumps to see if they held any clues. But first I had to find a crash dump to look at. Windows XP’s Application Error Reporting process always generates a dump before showing you the application crash dialog, and you can find the location of the dump by clicking to see the report details and then viewing the report’s technical information:
Windows Vista’s corresponding dialog doesn’t offer a way to get at a report’s technical information and it doesn’t generate a dump unless Microsoft’s Windows Error Reporting (WER) servers request it, which they only do for crashes reported in high volumes. Fortunately, WerFault, the process that presents the dialog, keeps the crashed process around until you press the Close Program button, which offers an opportunity to attach to the process with a debugger and examine it. You can see WerFault’s handle to a crashed Windows Media Player process in Process Explorer:
The next time I had a crash, I launched WinDbg, the Windows Debugger from the Debugging Tools for Windows package that’s available for free download from Microsoft. After making sure that I had the symbol configuration set to point at the Microsoft public symbol server (e.g. srv*c:\symbols*http://msdl.microsoft.com/download/symbols) in the Symbol File Path dialog, I went to the File menu and selected the “Attach to a Process...” menu entry:
That opens the WinDbg process selection dialog, which I scrolled through to find the crashed process. When I selected the process, WinDbg attached to it and presented the same interface it does when it loads a crash dump, with one difference. When you load a crash dump, you can execute the !analyze debugger command, which uses heuristics to try to pinpoint the cause of the crash. When you perform a debugger attach, an analysis will just tell you what you already know: that you attached with a debugger.
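The symbol path entered before attaching follows a `srv*<local cache>*<server URL>` pattern. As a minimal sketch, the string can be assembled like this; the helper name `symbol_path` is made up for illustration, and the cache directory is simply the one from the example above. The Debugging Tools can also pick the same string up from the `_NT_SYMBOL_PATH` environment variable.

```python
# Sketch: build a symbol-server path of the form srv*<cache>*<server>.
# symbol_path() is a hypothetical helper, not part of any debugger API.

MS_PUBLIC_SYMBOLS = "http://msdl.microsoft.com/download/symbols"


def symbol_path(cache_dir, server=MS_PUBLIC_SYMBOLS):
    """Return a symbol path that caches downloaded symbols in cache_dir."""
    return f"srv*{cache_dir}*{server}"


path = symbol_path(r"c:\symbols")
print(path)
# The same string could be stored in the _NT_SYMBOL_PATH environment
# variable instead of being typed into the Symbol File Path dialog.
```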
Looking for a potential cause of a crash when attached requires looking at the stack of each thread in the process, so I opened the Processes and Threads and Call Stack dialogs in the View menu:
I started examining threads by selecting the first entry in the threads dialog:
The WinDbg command window usually grays and says “Busy” as WinDbg pulls symbols from the symbol server, after which the call stack dialog populates with the function nesting of the selected thread at the time of the crash. I examined each thread’s stack in turn, moving between threads by pressing the down arrow and then the enter key, hunting for a stack that had function names with the words “exception” or “fault” in them. Near the end of the list I came across this one:
I noticed that the top of the list is full of functions with “Exception” in their names. Looking down the list (up the stack), I saw that a function in Nvappfilter called Kernel32.dll’s HeapFree function, leading to the crash. The exception in the heap’s free routines meant that either the caller passed a bogus heap address or that the heap was already corrupted when the function executed. If a Windows DLL had been the caller I would have suspected the latter, but in this case the caller was a third-party DLL, which I could tell by the fact that WinDbg couldn’t locate symbol information for it and hence didn’t know the names of the functions within it. I confirmed that by issuing the lm (list module) command to look at its version information:
Nvappfilter was now my primary suspect, but I didn’t have direct evidence that it was responsible. I continued to use the system and followed the same debugging steps on the next several crashes. Whether it was IE, WMP or a game, the faulting stack was always the same, with Nvappfilter calling HeapFree. That’s still not conclusive proof, but the anecdotal evidence was pretty compelling.
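The manual routine described above (scan each stack for "exception" or "fault", find who called HeapFree, and see whether the same module keeps turning up) can be sketched in a few lines. This is only an illustration under stated assumptions: the stacks are hypothetical frame lists typed in by hand, not the actual stacks from these crashes, `game.exe` is a placeholder name, and all helper names are invented. A keyword match is a heuristic, so a function such as one with "Default" in its name would be a false positive.

```python
import re
from collections import Counter

# Hypothetical data: frames are listed top of stack first, written as
# "module!function", or "module+offset" when no symbols are available.
CRASHES = {
    "iexplore.exe": [
        "kernel32!UnhandledExceptionFilter",
        "ntdll!KiUserExceptionDispatcher",
        "ntdll!RtlFreeHeap",
        "kernel32!HeapFree",
        "nvappfilter+0x1a2b",
    ],
    "wmplayer.exe": [
        "kernel32!UnhandledExceptionFilter",
        "ntdll!KiUserExceptionDispatcher",
        "ntdll!RtlFreeHeap",
        "kernel32!HeapFree",
        "nvappfilter+0x1a2b",
    ],
    "game.exe": [
        "kernel32!UnhandledExceptionFilter",
        "ntdll!KiUserExceptionDispatcher",
        "ntdll!RtlFreeHeap",
        "kernel32!HeapFree",
        "nvappfilter+0x1a2b",
    ],
}

KEYWORDS = re.compile(r"exception|fault", re.IGNORECASE)


def function_name(frame):
    """Return the function part of 'module!function' (or the whole frame)."""
    return frame.split("!", 1)[-1]


def module_name(frame):
    """Return the module part of 'module!function' or 'module+offset'."""
    return re.split(r"[!+]", frame, maxsplit=1)[0].lower()


def looks_faulting(frames):
    """True if any function name mentions an exception or a fault."""
    return any(KEYWORDS.search(function_name(f)) for f in frames)


def heapfree_caller(frames):
    """Return the module of the frame just below HeapFree, or None."""
    for i, frame in enumerate(frames[:-1]):
        if function_name(frame) == "HeapFree":
            return module_name(frames[i + 1])
    return None


# Count which module called HeapFree in each faulting stack.
tally = Counter(
    heapfree_caller(frames)
    for frames in CRASHES.values()
    if looks_faulting(frames)
)
print(tally.most_common())
```

A tally dominated by one module is still anecdotal evidence rather than proof, for the reason given above: the heap could already have been corrupted by someone else before that module freed its memory.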
At that point I went to see if there were updates for Nvappfilter, but I wasn’t sure what software package it was associated with. I entered its name in a Web search and discovered that it’s part of nVidia’s FirstPacket feature, which prioritizes game traffic and is included in the nForce motherboard’s software.
I went to nVidia’s site and downloaded the most recent nForce driver package, but it failed to update Nvappfilter.dll and I continued to have the crashes.
The nVidia control panel offers no way that I could find to prevent Nvappfilter from loading, so my only recourse was to manually disable it. I wasn’t using the FirstPacket feature, which I had previously been unaware of, so I wouldn’t miss it, but first I had to figure out how it configured Windows to load it. For that I turned to Autoruns, where I found references to Nvappfilter’s 32-bit and 64-bit versions in the Winsock Layered Service Provider (LSP) section:
I deleted all of Nvappfilter’s entries, rebooted the system and have been crash-free since. While I was writing this post, I checked again for nForce software updates to see if Nvappfilter had been updated. The latest version doesn’t look like it includes Nvappfilter or any other Winsock LSP, so assuming Nvappfilter was at fault, it’s no longer an issue.
One other thing I’ve done since I investigated these crashes is take advantage of Vista SP1’s “local dumps” functionality so that I’ll automatically get a crash dump to investigate for any application crash I experience. If you create a key named HKLM\Software\Microsoft\Windows\Windows Error Reporting\LocalDumps, WerFault will always save a dump. Dumps go by default into %LOCALAPPDATA%\CrashDumps, but you can override that with a Registry value (DumpFolder) and also limit the number of dumps WerFault will keep (DumpCount).
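For anyone who would rather script the LocalDumps setup than use Regedit, here is a minimal sketch using Python's standard `winreg` module. The value names (DumpFolder, DumpCount, DumpType) and the defaults shown are my understanding of Microsoft's WER documentation, so check them against the docs before relying on them; the function names are invented for this example. Writing under HKLM requires an elevated prompt, and creating the key alone is enough to turn the feature on.

```python
# Sketch: enable WER "local dumps" by creating the LocalDumps key.
# Assumptions: value names and types follow Microsoft's WER documentation;
# local_dumps_values() and apply_local_dumps() are hypothetical helpers.

LOCAL_DUMPS_KEY = r"SOFTWARE\Microsoft\Windows\Windows Error Reporting\LocalDumps"


def local_dumps_values(dump_folder=r"%LOCALAPPDATA%\CrashDumps",
                       dump_count=10, dump_type=1):
    """Return (name, registry type name, data) tuples for the optional values.

    dump_type: 0 = custom, 1 = mini dump, 2 = full dump.
    """
    if dump_type not in (0, 1, 2):
        raise ValueError("dump_type must be 0, 1 or 2")
    if dump_count < 1:
        raise ValueError("dump_count must be at least 1")
    return [
        ("DumpFolder", "REG_EXPAND_SZ", dump_folder),
        ("DumpCount", "REG_DWORD", dump_count),
        ("DumpType", "REG_DWORD", dump_type),
    ]


def apply_local_dumps(values):
    """Create the LocalDumps key and write the values (Windows only, elevated)."""
    import winreg  # imported here so the planning function works anywhere

    access = winreg.KEY_WRITE | winreg.KEY_WOW64_64KEY
    with winreg.CreateKeyEx(winreg.HKEY_LOCAL_MACHINE, LOCAL_DUMPS_KEY,
                            0, access) as key:
        for name, kind, data in values:
            winreg.SetValueEx(key, name, 0, getattr(winreg, kind), data)


# Show what would be written; call apply_local_dumps(...) to actually write it.
for name, kind, data in local_dumps_values(dump_count=5, dump_type=2):
    print(f"{name} ({kind}) = {data}")
```

Keeping the registry write in its own function means the settings can be reviewed, or printed as above, before anything touches the machine.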
A glance at the NVidia forums (http://www.google.com/search?hl=en&q=+site:forums.nvidia.com+nvappfilter) seems to show that nvappfilter.dll is associated with something called the "NVidia Firewall" (possibly aka "nam"). I don't think a video card company has any business writing firewalls.
One can only hope that someone responsible at NVidia reads this blog, or that you, Mark, have a contact through Microsoft to pass this information on to them... a most excellent and informative debugging journey.
BTW, if you try the LSP "fix" used in this article on a pre-Vista operating system, you'll likely lose network connectivity due to a corrupt LSP stack. Only Vista and later know how to automatically fix this...
Mark, great post.
@Ross: It's worth pointing out that nVidia now provides full chipsets to motherboard manufacturers. They are much more than just a "video card company" nowadays.
For the longest time I only used nVidia because I couldn't stand ATI drivers. Lately however, nVidia drivers are getting terrible (the new "advanced" control panel is the perfect example). I might have to re-evaluate and switch to ATI.
The Nvidia Firewall has had problems for years now. It never worked properly under XP, so I doubt if it is certified under Vista anyway.
Just uninstall it. It is in your Add/remove programs as the "NVIDIA ForceWare Network Access Manager". I never install it on any nForce motherboards. It is not worth the hassle.
Phileosophos (and others) noted: "... don't have the kind of familiarity with WinDbg necessary to find such things. And I doubt all that many other developers do either."
What most of us need is a handy "what to do and how to do it when we need it" guide. Guess what? If you have downloaded the free MS Windows Debugger, you already have this guide.
It's the attached "Debugging Help" file. Not only does it explain how to use the debugger interface, it's got a built-in step-by-step guide on how to debug specific kinds of problems. Open the Help file and then go to the "Debugging Techniques" section on the "Contents" tab. A few of the sections are: Elementary Debugging Techniques (you probably already know these, being a developer), Bug Checks (Blue Screens), RPC Debugging, Plug and Play, Advanced Techniques, etc. I highly recommend that anyone new to debugging review what's here (since it's 4 MB, there's a LOT of information available). Especially since it's free!
Of course, there are also several books on the market that cover these topics as well.
Thanks Mark, another great post. However, you can't rely on software tests like Windows Vista Memory Diagnostic to clear bad RAM: a failed test proves the RAM is faulty, but a passed test doesn't prove it is good.
The best way to test RAM is to either use a hardware RAM tester or swap the suspect RAM with known good RAM and test over a period of time to see if the symptoms disappear.
I don't think it would make any difference if the user switched from IE and WMP.
From my understanding of the post, the Nvidia component had inserted itself as some kind of filter on the WinSock TCP/IP stack. This is the core TCP/IP API for Windows and it's very likely that Firefox and VLC would both use the same API.
Just my $0.02.....
Mark: This might be a bit too personal, but how much gaming do you do? What games are you playing?
I gather module statistics in reported crash dumps for uTorrent, and Nvappfilter is the leading cause of crashes for us.
It's really surprising how many legitimate modules cause other applications to crash - orders of magnitude more than real application bugs, at least in our case and Explorer's, as pointed out earlier.
Oh Dear Lord! I have been having this exact same issue on my machine and put it down to an Axis Media control problem.
I just disabled the filters and no crashes for me anymore!
Joel Peterson: I like first-person shooters, both single and multiplayer, but prefer multiplayer. I've been an addict of the Battlefield series starting with 1942, then Vietnam, 2 and now 2142.
Hey Mark: I agree with Ilya above, it's fantastic you continue with these posts. I'd like to thank you for your website, your blogs and your tools; they're what made using Windows bearable for me, because they gave me some control. Please don't stop what you're doing.
On the other hand, I *really* rediscovered the joy of computing when I switched to Linux... a different solution to yours, I guess. (chuckles)
Thanks! This post helped me track down a persistent problem I've had launching IE. It was strange, because I'd used tools to reset and remove all my third-party plug-ins, but that did not resolve the problem. The strangeness was compounded by the fact that I could run IE fine as an Administrator, but not as the default identity.
I used the debugger the way you demonstrated, and found that the cause was a DLL from Pantone written for XP, which had hooks into the browser rather than installing as a plug-in. As soon as I uninstalled Pantone Colorist, everything worked fine.
Thank you very much for this informative blog and excellent advice. I am grateful for your tools and really look forward to seeing what you guys have been able to do in the transition from Winternals ERD Commander to DART. Would DEM provide some kind of functionality to help IT shops better capture and monitor crashdumps and the even more dreaded BSODs? Maybe you can talk about these in a future blog posting? :-)