Welcome to TechNet Blogs Sign in | Join | Help

The Case of the Crashed Phone Call

David Solomon, my coauthor for the Windows Internals books, was recently in the middle of an important VOIP call on Skype when the audio suddenly garbled. A second later the system blue screened. He called back after the reboot, but a half hour later the person on the other seemed to stop talking mid-word and the system crashed again. The conversation was essentially over anyway, and since he’d explained the first drop, Dave decided not to call back and formally end the call, but to investigate the cause of the crashes. He launched Windbg from the Debugging Tools for Windows package, selected Open Crash Dump from the File menu, and chose %Systemroot%\Memory.dmp.

He’d previously configured Windbg to use the Microsoft public symbol server by entering “srv*c:\symbols*http://msdl.microsoft.com/download/symbols” in the Windbg symbols configuration dialog, so Windbg knew how to interpret the crash dump file.  When Windbg loads a crash dump file, it automatically executes a heuristics-based analysis engine that identifies the driver or system component most likely responsible for the crash. The analysis output pointed at the NETw4v64.sys device driver:

image

When you click on the “!analyze –v” hyperlink in the output, Windbg prints out some of the data it uses in its analysis. The analysis heuristics aren’t perfect, so Dave always clicks the link to look at the additional data, specifically the stack trace at the time of the crash and possibly memory locations associated with the crash. The stack trace records the nesting of function calls on the processor from which the kernel’s crash function, KeBugCheckEx, was called. In this case the stack looked like this:

image

You read the stack from bottom to top to follow the chronology of function calls. The trace shows that some code in NETw4v64 called the kernel’s (“nt”) KeAcquireSpinLockRaiseToDpc function. NETw4v64’s stack frame doesn’t have a text function name, which is expected for drivers that aren’t part of Windows and therefore don’t have symbols on the Microsoft symbol server. The next higher frame indicates that KeAcquireSpinLockRaiseToDpc called KiPageFault, most likely not directly, but as the result of a reference to a virtual memory address that wasn’t currently resident in physical memory. KiPageFault then called KeBugCheckEx with stop code A, which the extended analysis output describes as IRQL_NOT_LESS_OR_EQUAL:

image

Dave hypothesized that the NETw4v64 driver had called the kernel with a corrupted pointer that triggered the invalid memory reference. This particular crash might have been the result of random corruption, even by another driver, so he looked in the %Systemroot%\Minidump directory for the dump file for the first crash. On Windows Vista, the operating system he was running, the system always saves a kernel-memory dump to %Systemroot%\Memory.dmp, overwriting the previous dump, and archives an abbreviated form of the dump, called a minidump, to %Systemroot%\Minidump. He followed the same steps for the second dump and the analysis engine reported the exact same cause for the crash, down to the same corrupted memory pointer value.

Without performing a meticulous manual analysis of a dump, you can’t be certain that the driver the heuristics point at is the culprit, but the first rule of crash mitigation is to make sure you have the latest versions of any implicated drivers. Sometimes Windows Update has optional updates that don’t apply automatically, so Dave went to the %Systemroot%\System32\drivers directory to investigate the NETw4v64.sys file for clues as to what device it was for. The file properties dialog showed that it was version 11.5 of the “Intel Wireless WiFi Link Driver”:

image

Armed with the knowledge that it was an Intel wireless network driver, he opened Device Manager, expanded the Network Adapters node and found a device with a similar name:

image 

He right-clicked and chose “Update Driver Software…” from the context menu to launch the driver update wizard, and told it to check Windows Update for a newer version. Unfortunately, it reported that he had the most current version installed:

image

Sometimes OEMs have drivers posted on their Web sites that they haven’t yet been made available to Windows Update, so Dave next went to Dell, the brand of his laptop, to check the version there. Again, the version he found posted seemed to match the one he had:

image

OEMs often get hardware vendors to create custom versions of hardware tuned for specific cost, power, capability or size requirements. The original hardware vendor will therefore not post drivers for an OEM-only device or post drivers that are generic and might not take advantage of OEM-specific features.  It’s always worth checking, though, so Dave went to Intel’s site. To his chagrin, not only was there a newer version that installed and worked as expected, but the Intel driver was version 12.1, a major release number higher than the one Dell was hosting:

image

Intel also conveniently offered the driver in a “Drivers-Only” download that was a mere 7MB, one tenth the size of the package on Dell’s site that also includes value-add management software. 

Dave couldn’t conclusively close the case because he couldn’t be sure that the Intel driver was the actual cause of the crashes, but the crashes haven’t reoccurred. Even if the Intel driver wasn’t the root cause, Dave was happy that he picked up a newer version that most likely had performance, reliability and maybe even power-management improvements. The case is a great example of simple dump analysis and the lesson that Windows Update and even an OEM’s site might not have the most up-to-date drivers. Hopefully, Dell will start leveraging Windows Update to provide its customers the latest drivers.

Pushing the Limits of Windows: Virtual Memory

In my first Pushing the Limits of Windows post, I discussed physical memory limits, including the limits imposed by licensing, implementation, and driver compatibility. This time I’m turning my attention to another fundamental resource, virtual memory. Virtual memory separates a program’s view of memory from the system’s physical memory, so an operating system decides when and if to store the program’s code and data in physical memory and when to store it in a file. The major advantage of virtual memory is that it allows more processes to execute concurrently than might otherwise fit in physical memory.

While virtual memory has limits that are related to physical memory limits, virtual memory has limits that derive from different sources and that are different depending on the consumer. For example, there are virtual memory limits that apply to individual processes that run applications, the operating system, and for the system as a whole. It's important to remember as you read this that virtual memory, as the name implies, has no direct connection with physical memory. Windows assigning the file cache a certain amount of virtual memory does not dictate how much file data it actually caches in physical memory; it can be any amount from none to more than the amount that's addressable via virtual memory.

Process Address Spaces

Each process has its own virtual memory, called an address space, into which it maps the code that it executes and the data that the code references and manipulates. A 32-bit process uses 32-bit virtual memory address pointers, which creates an absolute upper limit of 4GB (2^32) for the amount of virtual memory that a 32-bit process can address. However, so that the operating system can reference its own code and data and the code and data of the currently-executing process without changing address spaces, the operating system makes its virtual memory visible in the address space of every process. By default, 32-bit versions of Windows split the process address space evenly between the system and the active process, creating a limit of 2GB for each:

 image

Applications might use Heap APIs, the .NET garbage collector, or the C runtime malloc library to allocate virtual memory, but under the hood all of these rely on the VirtualAlloc API. When an application runs out of address space then VirtualAlloc, and therefore the memory managers layered on top of it, return errors (represented by a NULL address). The Testlimit utility, which I wrote for the 4th Edition of Windows Internals to demonstrate various Windows limits,  calls VirtualAlloc repeatedly until it gets an error when you specify the –r switch. Thus, when you run the 32-bit version of Testlimit on 32-bit Windows, it will consume the entire 2GB of its address space:

image

2010 MB isn’t quite 2GB, but Testlimit’s other code and data, including its executable and system DLLs, account for the difference. You can see the total amount of address space it’s consumed by looking at its Virtual Size in Process Explorer:

image

Some applications, like SQL Server and Active Directory, manage large data structures and perform better the more that they can load into their address space at the same time. Windows NT 4 SP3 therefore introduced a boot option, /3GB, that gives a process 3GB of its 4GB address space by reducing the size of the system address space to 1GB, and Windows XP and Windows Server 2003 introduced the /userva option that moves the split anywhere between 2GB and 3GB:

 image

To take advantage of the address space above the 2GB line, however, a process must have the ‘large address space aware’ flag set in its executable image. Access to the additional virtual memory is opt-in because some applications have assumed that they’d be given at most 2GB of the address space. Since the high bit of a pointer referencing an address below 2GB is always zero, they would use the high bit in their pointers as a flag for their own data, clearing it of course before referencing the data. If they ran with a 3GB address space they would inadvertently truncate pointers that have values greater than 2GB, causing program errors including possible data corruption.

All Microsoft server products and data intensive executables in Windows are marked with the large address space awareness flag, including Chkdsk.exe, Lsass.exe (which hosts Active Directory services on a domain controller), Smss.exe (the session manager), and Esentutl.exe (the Active Directory Jet database repair tool). You can see whether an image has the flag with the Dumpbin utility, which comes with Visual Studio:

image

Testlimit is also marked large-address aware, so if you run it with the –r switch when booted with the 3GB of user address space, you’ll see something like this:

image

Because the address space on 64-bit Windows is much larger than 4GB, something I’ll describe shortly, Windows can give 32-bit processes the maximum 4GB that they can address and use the rest for the operating system’s virtual memory. If you run Testlimit on 64-bit Windows, you’ll see it consume the entire 32-bit addressable address space:

image

64-bit processes use 64-bit pointers, so their theoretical maximum address space is 16 exabytes (2^64). However, Windows doesn’t divide the address space evenly between the active process and the system, but instead defines a region in the address space for the process and others for various system memory resources, like system page table entries (PTEs), the file cache, and paged and non-paged pools.

The size of the process address space is different on IA64 and x64 versions of Windows where the sizes were chosen by balancing what applications need against the memory costs of the overhead (page table pages and translation lookaside buffer - TLB - entries) needed to support the address space. On x64, that’s 8192GB (8TB) and on IA64 it’s 7168GB (7TB - the 1TB difference from x64 comes from the fact that the top level page directory on IA64 reserves slots for Wow64 mappings). On both IA64 and x64 versions of Windows, the size of the various resource address space regions is 128GB (e.g. non-paged pool is assigned 128GB of the address space), with the exception of the file cache, which is assigned 1TB. The address space of a 64-bit process therefore looks something like this:

image

The figure isn’t drawn to scale, because even 8TB, much less 128GB, would be a small sliver. Suffice it to say that like our universe, there’s a lot of emptiness in the address space of a 64-bit process.

When you run the 64-bit version of Testlimit (Testlimit64) on 64-bit Windows with the –r switch, you’ll see it consume 8TB, which is the size of the part of the address space it can manage:

image

image 

Committed Memory

Testlimit’s –r switch has it reserve virtual memory, but not actually commit it. Reserved virtual memory can’t actually store data or code, but applications sometimes use a reservation to create a large block of virtual memory and then commit it as needed to ensure that the committed memory is contiguous in the address space. When a process commits a region of virtual memory, the operating system guarantees that it can maintain all the data the process stores in the memory either in physical memory or on disk.  That means that a process can run up against another limit: the commit limit.

As you’d expect from the description of the commit guarantee, the commit limit is the sum of physical memory and the sizes of the paging files. In reality, not quite all of physical memory counts toward the commit limit since the operating system reserves part of physical memory for its own use. The amount of committed virtual memory for all the active processes, called the current commit charge, cannot exceed the system commit limit. When the commit limit is reached, virtual allocations that commit memory fail. That means that even a standard 32-bit process may get virtual memory allocation failures before it hits the 2GB address space limit.

The current commit charge and commit limit is tracked by Process Explorer in its System Information window in the Commit Charge section and in the Commit History bar chart and graph:

image  image

Task Manager prior to Vista and Windows Server 2008 shows the current commit charge and limit similarly, but calls the current commit charge "PF Usage" in its graph:

image

On Vista and Server 2008, Task Manager doesn't show the commit charge graph and labels the current commit charge and limit values with "Page File" (despite the fact that they will be non-zero values even if you have no paging file):

image

You can stress the commit limit by running Testlimit with the -m switch, which directs it to allocate committed memory. The 32-bit version of Testlimit may or may not hit its address space limit before hitting the commit limit, depending on the size of physical memory, the size of the paging files and the current commit charge when you run it. If you're running 32-bit Windows and want to see how the system behaves when you hit the commit limit, simply run multiple instances of Testlimit until one hits the commit limit before exhausting its address space.

Note that, by default, the paging file is configured to grow, which means that the commit limit will grow when the commit charge nears it. And even when when the paging file hits its maximum size, Windows is holding back some memory and its internal tuning, as well as that of applications that cache data, might free up more. Testlimit anticipates this and when it reaches the commit limit, it sleeps for a few seconds and then tries to allocate more memory, repeating this indefinitely until you terminate it.

If you run the 64-bit version of Testlimit, it will almost certainly will hit the commit limit before exhausting its address space, unless physical memory and the paging files sum to more than 8TB, which as described previously is the size of the 64-bit application-accessible address space. Here's the partial output of the 64-bit Testlimit  running on my 8GB system (I specified an allocation size of 100MB to make it leak more quickly):

 image

And here's the commit history graph with steps when Testlimit paused to allow the paging file to grow:

image

When system virtual memory runs low, applications may fail and you might get strange error messages when attempting routine operations. In most cases, though, Windows will be able present you the low-memory resolution dialog, like it did for me when I ran this test:

image

After you exit Testlimit, the commit limit will likely drop again when the memory manager truncates the tail of the paging file that it created to accommodate Testlimit's extreme commit requests. Here, Process Explorer shows that the current limit is well below the peak that was achieved when Testlimit was running:

image

Process Committed Memory

Because the commit limit is a global resource whose consumption can lead to poor performance, application failures and even system failure, a natural question is 'how much are processes contributing the commit charge'? To answer that question accurately, you need to understand the different types of virtual memory that an application can allocate.

Not all the virtual memory that a process allocates counts toward the commit limit. As you've seen, reserved virtual memory doesn't. Virtual memory that represents a file on disk, called a file mapping view, also doesn't count toward the limit unless the application asks for copy-on-write semantics, because Windows can discard any data associated with the view from physical memory and then retrieve it from the file. The virtual memory in Testlimit's address space where its executable and system DLL images are mapped therefore don't count toward the commit limit. There are two types of process virtual memory that do count toward the commit limit: private and pagefile-backed.

Private virtual memory is the kind that underlies the garbage collector heap, native heap and language allocators. It's called private because by definition it can't be shared between processes. For that reason, it's easy to attribute to a process and Windows tracks its usage with the Private Bytes performance counter. Process Explorer displays a process private bytes usage in the Private Bytes column, in the Virtual Memory section of the Performance page of the process properties dialog, and displays it in graphical form on the Performance Graph page of the process properties dialog. Here's what Testlimit64 looked like when it hit the commit limit:

image

image

Pagefile-backed virtual memory is harder to attribute, because it can be shared between processes. In fact, there's no process-specific counter you can look at to see how much a process has allocated or is referencing. When you run Testlimit with the -s switch, it allocates pagefile-backed virtual memory until it hits the commit limit, but even after consuming over 29GB of commit, the virtual memory statistics for the process don't provide any indication that it's the one responsible:

image

For that reason, I added the -l switch to Handle a while ago. A process must open a pagefile-backed virtual memory object, called a section, for it to create a mapping of pagefile-backed virtual memory in its address space. While Windows preserves existing virtual memory even if an application closes the handle to the section that it was made from, most applications keep the handle open.  The -l switch prints the size of the allocation for pagefile-backed sections that processes have open. Here's partial output for the handles open by Testlimit after it has run with the -s switch:

image

You can see that Testlimit is allocating pagefile-backed memory in 1MB blocks and if you summed the size of all the sections it had opened, you'd see that it was at least one of the processes contributing large amounts to the commit charge.

How Big Should I Make the Paging File?

Perhaps one of the most commonly asked questions related to virtual memory is, how big should I make the paging file? There’s no end of ridiculous advice out on the web and in the newsstand magazines that cover Windows, and even Microsoft has published misleading recommendations. Almost all the suggestions are based on multiplying RAM size by some factor, with common values being 1.2, 1.5 and 2. Now that you understand the role that the paging file plays in defining a system’s commit limit and how processes contribute to the commit charge, you’re well positioned to see how useless such formulas truly are.

Since the commit limit sets an upper bound on how much private and pagefile-backed virtual memory can be allocated concurrently by running processes, the only way to reasonably size the paging file is to know the maximum total commit charge for the programs you like to have running at the same time. If the commit limit is smaller than that number, your programs won’t be able to allocate the virtual memory they want and will fail to run properly.

So how do you know how much commit charge your workloads require? You might have noticed in the screenshots that Windows tracks that number and Process Explorer shows it: Peak Commit Charge. To optimally size your paging file you should start all the applications you run at the same time, load typical data sets, and then note the commit charge peak (or look at this value after a period of time where you know maximum load was attained). Set the paging file minimum to be that value minus the amount of RAM in your system (if the value is negative, pick a minimum size to permit the kind of crash dump you are configured for). If you want to have some breathing room for potentially large commit demands, set the maximum to double that number.

Some feel having no paging file results in better performance, but in general, having a paging file means Windows can write pages on the modified list (which represent pages that aren’t being accessed actively but have not been saved to disk) out to the paging file, thus making that memory available for more useful purposes (processes or file cache). So while there may be some workloads that perform better with no paging file, in general having one will mean more usable memory being available to the system (never mind that Windows won’t be able to write kernel crash dumps without a paging file sized large enough to hold them).

Paging file configuration is in the System properties, which you can get to by typing “sysdm.cpl” into the Run dialog, clicking on the Advanced tab, clicking on the Performance Options button, clicking on the Advanced tab (this is really advanced), and then clicking on the Change button:

image

You’ll notice that the default configuration is for Windows to automatically manage the page file size. When that option is set on Windows XP and Server 2003,  Windows creates a single paging file that’s minimum size is 1.5 times RAM if RAM is less than 1GB, and RAM if it's greater than 1GB, and that has a maximum size that's three times RAM. On Windows Vista and Server 2008, the minimum is intended to be large enough to hold a kernel-memory crash dump and is RAM plus 300MB or 1GB, whichever is larger. The maximum is either three times the size of RAM or 4GB, whichever is larger. That explains why the peak commit on my 8GB 64-bit system that’s visible in one of the screenshots is 32GB. I guess whoever wrote that code got their guidance from one of those magazines I mentioned!

A couple of final limits related to virtual memory are the maximum size and number of paging files supported by Windows. 32-bit Windows has a maximum paging file size of 16TB (4GB if you for some reason run in non-PAE mode) and 64-bit Windows can having paging files that are up to 16TB in size on x64 and 32TB on IA64. For all versions, Windows supports up to 16 paging files, where each must be on a separate volume.

The Case of the Slooooow System

A few weeks ago my wife complained that her Vista desktop was not responding to her typing or mouse clicks. Given the importance of the customer, I immediately sat down at the system to troubleshoot.  It wasn’t completely hung, but extremely sluggish. For example, the mouse moved and when I clicked on the start button the start menu opened after about 30 seconds. I suspected that something was hogging the CPU and likely could have resolved the problem simply by logging off or rebooting, but knew that if I didn’t determine the root cause and address it, she’d likely be calling on my technical support services again in the near future. In any case, stooping to that kind of troubleshooting hack is beneath my dignity. I therefore set out to investigate.

My first step was to run Process Explorer to see which process was using the CPU. After a few minutes Process Explorer finally appeared and showed that not one, but two processes were involved, each consuming 50% of the CPU: Iexplore.exe and Dllhost.exe. Iexplore is Internet Explorer (IE) and I suspected that IE itself wasn’t the problem, but that it was a browser helper object (BHO), ActiveX control, or some other plugin loaded into IE. Similarly, Dllhost.exe is the host process for out-of-process COM server DLLs, so it was probably not at fault, but the COM server loaded into it. Both required digging deeper and I decided to tackle IE first.

In order to try and get some CPU headroom in which to operate, I suspended the Dllhost process by selecting it in Process Explorer, right-clicking to open the process context menu, and selecting the Suspend entry:

image 

That put the Dllhost process to sleep and, as I expected, that freed up 50% of the CPU. That’s because the computer was a dual-core system and so to consume 100% of the available CPU cycles a process would have to have two threads, each hogging one of the cores. Most bugs I've seen that result in the CPU being pegged are caused by a single thread.

Processes don’t execute code, threads do, so I needed to look inside the IE process to see what thread or threads were running. I double-clicked on Iexplore.exe in Process Explorer to open its process properties dialog and switched to the Threads page. Several threads were running, but one was dominating the CPU:

image 

From past experience I knew that Ieframe.dll was part of IE, but to be sure I clicked on the modules button on the Threads tab of the Properties dialog and switched to the Details page of the resulting Shell properties dialog:

image 

The description didn't give me a clue as the thread's specific purpose, so I moved to the second clue about the thread, its start function. Because I had configured Process Explorer to retrieve symbols for Windows images from the Microsoft symbol server in Options->Configure Symbols, Process Explorer showed the name of the function where each thread began executing. Sometimes the DLL or function where a thread starts executing is enough to identify the thread’s purpose or the software causing a problem. In this case, the thread began in a function named CTablWindow::_TabWindowThreadProc. The function name hints that it’s the one in which the main thread of a tab starts running, but that still wasn’t enough to tell me why the thread was running so much; I needed to dig even deeper and look inside the thread to see where it was executing.

To look at what the thread was up to, I double-clicked on it in the Threads list to open the Thread Stack dialog, which shows the functions on the thread’s stack. A stack is essentially an execution history, where each function listed called the one above it on the list and the function at the top of the list is the one most recently executed by the thread at the time of Process Explorer looks at the stack. I scrolled through the list, looking for frames that referenced 3rd-party DLLs or Microsoft IE plugins, since they would be far more likely to have a bug than IE’s own code. Sure enough, I found frames pointing at a popular 3rd-party ActiveX control, Adobe Flash:

image

Just to be sure that I hadn’t happened to catch Flash running when a different component was using most of the CPU time, I closed and reopened the stack dialog several times, but all of them pointed at Flash.

The first thing I do when I suspect that some software is causing a problem is to check the vendor’s web site to make sure that I have the latest version. I opened the Process Explorer DLL view and looked at Flash.ocx’s version, went to Adobe’s site and looked at the version of the current Flash download, and they were the same. 

I was at a dead end. I couldn’t know for sure if Flash had a bug or, more likely, there was a Flash application that had a bug, nor could I be sure that the problem wouldn’t recur. I tried to determine which site was hosting the Flash content by closing tabs one by one, but when I had close them all the thread was still running.

At this point the only options I had were to uninstall Flash and leave my wife with a degraded web experience, or terminate IE to stop the current CPU usage and hope that it wouldn’t happen again. I chose the latter and the case remains open. Since investigating this I’ve seen the same Flash behavior again on my wife’s system and on my own, so have been vigilantly watching the Adobe site for a new version just in case its due to a bug in Flash itself. I was disappointed that there was no actionable result of the investigation, but at least I knew what had caused the CPU usage.

I now turned my attention the Dllhost problem with the hope that I'd meet with better success. Process Explorer lists in a tooltip the component or components loaded into hosting processes like Svchost.exe (the Windows service host process), Rundll32 (the Control Panel applet hosting process), Taskeng.exe (the scheduled task hosting process on Vista and Server 2008), and Dllhost.exe. I moved the mouse over Dllhost.exe to see what COM server it was running:

image

It was running the Thumbnail Cache COM server, whose job it is to create Explorer thumbnails for image and media files. It is part of Windows, so once again I had to look inside the process for more clues. I resumed the Dllhost process I had suspended earlier and opened the process properties threads page:

image 

The thread consuming the most CPU in this case started in Quartz.dll’s ObjectThread function. I looked at its properties and saw that it was another Windows DLL, the DirectShow Runtime, with a generic function name:

image

Next, I double-clicked to look at the thread stack:

image

The first few frames were in User32.dll and Ntdll.dll, core Windows system DLLs, but frames 4-7 are in the Sonicmp4demux.ax (".ax" is an extension commonly used for DirectShow filters), a 3rd-party component. The function names for those frames were the same and didn't make sense because the Microsoft symbol server only stores symbols for software included in Windows. Several more stack snapshots confirmed that it was the code causing the CPU usage.

Now that I had my suspect, the next step was to check for a newer version. But first I had to figure out what software the DLL came with, which was harder than it seemed. I opened the DLL view to take a closer look at the version information, but the description didn't reveal anything:

image

There were no folders in the Start menu or items in the Add/Remove Programs list with Sonic in the name. I Windows-Live-searched (I expect that word to be added to Webster's any day now) for Sonic and found that it's part of the Roxio's CD and DVD authoring software suites. I looked in the start menu and sure enough, found a Roxio folder:

image

I ran the Roxio software to check its version number and discovered that the Creator application includes a built-in facility to check for updates. I ran it, but it came up empty:

image

I checked the Roxio web site just to be sure and it turned out there was a newer version that the built-in updater hadn't offered, perhaps because the update, according to the page, didn't offer anything new:

image

I downloaded it anyway (all 640MB of it!) and waited the 15 or so minutes for it to install. Then I checked the version information of Sonicmp4demux.ax to see if it was newer, but its version number, 1.4.402.60802, was the same as the one I'd seen in the DLL view and the file was two years old:

image

I could have uninstalled the software, which would ensure that the problem wouldn't return, but I wanted to keep Roxio for its DVD authoring functionality. I didn't care if I didn't get thumbnails for Roxio-specific image formats - I wasn't even sure there were any I'd ever see in Explorer - so I set out to see if I could disable just the Sonic demultiplexer. I could have searched the Registry for the DLL name, which is surely where it was registered, but that's a brute-force approach and if there were indirect or multiple references I could easily end up disabling more than just its thumbnail generation and possibly breaking something in Windows.

Process Monitor was the perfect tool for the job. Because I didn't know when the problem might reoccur - it might takes days to reproduce - I didn't want to just run it and let it consume all available virtual memory or disk space, so I set the History Depth in the Options menu to have Process Monitor retain only the most recent 1 million events:

image

I also set an Include filter for paths matching C:\Windows\System32\Dllhost.exe, minimized it, and let my wife have the system back.

The next day I came home from work, sat down at the computer and saw from Process Explorer that Dllhost.exe was back at it, consuming 50% of the CPU. I suspect that because it's a dual-core system, the problem had been showing up regularly, but my wife hadn't noticed it because the remaining CPU capacity was enough to mask it (another good reason to buy multi-core processors!). I brought Process Monitor to the foreground and noted it had seen 114,000 Dllhost operations, which was obviously way too many to scan through individually. I searched for "sonicmp4" and found a reference in a Registry query near the end of the trace:

image

The query is of a COM object registration for the demultiplexer. Because the COM object is a 3rd-party DLL, I was certain that that COM Class ID (CLSID) isn't hard-coded into Windows, so I went back to the first entry in the trace and searched for "A7DD215", the first few characters of the CLSID. The search found a match a few thousand operations earlier:

image

The CLSID was in the name of a Registry key under another COM object registration. I Windows-Live-searched (that just rolls off the tongue, doesn't it?) for the parent CLSID and found this KB article that explains that the registry key is where DirectShow filters register: http://msdn.microsoft.com/en-us/library/ms787560(VS.85).aspx  I took a look at the stack for the particular query to confirm that's the reason Dllhost was reading from there:

image

I was now confident that I could simply rename the Sonic filter registration key to prevent its use. I never delete registry keys when performing this kind of troubleshooting just in case the change disables important functionality or somehow breaks something else. I had seen from the traces that the thumbnail cache generator had come across an AVI file that caused it to load the Sonic demultiplexer, a format Windows is obviously able to handle on its own, so I was pretty sure things would continue to work. After terminating the Dllhost and making the change, I browsed to the same folder, deleted the thumbnails, and confirmed that there was no reduced functionality as far as I could tell. I then used Roxio to successfully burn a DVD with a number of AVI files. This case was closed.

My wife's system was now usable again, and though I wasn't able to close the Flash-related part of the case, at least I knew the cause and could keep an eye out for updates. More importantly, by solving the Dllhost part of the case, even if Flash went crazy again, her system would still be usable and she wouldn't be filing a critical support incident for it with me - thanks to Process Explorer and Process Monitor.

Where in the World is Mark Russinovich?

I haven't had a chance to write a new post in a while because I've been busy working on Windows, new Sysinternals tools and enhancements to existing ones, and the 5th edition of Windows Internals, so I thought that I'd update you on my speaking schedule, book status, and what's going on at Sysinternals.

My next event is one that anyone can easily attend live, or via recorded webcast: it's the third virtual roundtable in the Microsoft Springboard series of round tables I've been hosting. Springboard is program designed to connect IT pros with practical information, guidance and tools to help them in their evaluation and deployment of Windows without marketing fluff getting in the way. This next round table is on September 24 and takes on the topic of performance. As usual, we'll have a panel of MVPs and customers sharing their experiences and real world tips with you. You can sign up to watch it live and find the contact to send in your questions ahead of time here.

In addition to the round table, I've got a full conference schedule for the Fall, including three keynotes:

TechEd Hong Kong, October 8-10, Wanchai, Hong Kong

Virtualization Congress, October 15-16, London, UK

Microsoft Platforma, December 4-5, Moscow, Russia

I'm also returning to one of my favorite conferences, TechEd EMEA IT Pros. I love reconnecting with my speaker friends, the enthusiastic European attendees, and Barcelona. I'm delivering several sessions, including an updated "The Case of the Unexplained...", complete with all new examples.

TechEd EMEA IT Pro, November 3-7, Barcelona, Spain

I hope to see you at one of these events, and if attend one of my sessions please stop by and say hello.

The book, which is updated to focus exclusively on Windows Vista and Windows Server 2008, is well along and we're on track for publication in January. I'm writing it again with David Solomon, my coauthor on the previous two editions, and Alex Ionescu, who is new to this edition and contributing great content. With all the new information and experiments, the book is going to be around 250 pages longer, making it its bed-time reading value stretch even longer. You can find information on the book on its official home page here.

Finally, Bryce and I have some exciting Sysinternals updates, including a major Process Monitor update and enhancements to Process Explorer, planned for release in the coming weeks and months.

If you'd like to hear directly from me on what I'm up to at Microsoft, what's behind the Sysinternals operation, what new feature we're releasing in Process Monitor, and my views on Windows, operating system security, and more, check out my recent interview with TechNet Edge.

Pushing the Limits of Windows: Physical Memory

This is the first blog post in a series I'll write over the coming months called Pushing the Limits of Windows that describes how Windows and applications use a particular resource, the licensing and implementation-derived limits of the resource, how to measure the resource’s usage, and how to diagnose leaks. To be able to manage your Windows systems effectively you need to understand how Windows manages physical resources, such as CPUs and memory, as well as logical resources, such as virtual memory, handles, and window manager objects. Knowing the limits of those resources and how to track their usage enables you to attribute resource usage to the applications that consume them, effectively size a system for a particular workload, and identify applications that leak resources.

Physical Memory

One of the most fundamental resources on a computer is physical memory. Windows' memory manager is responsible with populating memory with the code and data of active processes, device drivers, and the operating system itself. Because most systems access more code and data than can fit in physical memory as they run, physical memory is in essence a window into the code and data used over time. The amount of memory can therefore affect performance, because when data or code a process or the operating system needs is not present, the memory manager must bring it in from disk.

Besides affecting performance, the amount of physical memory impacts other resource limits. For example, the amount of non-paged pool, operating system buffers backed by physical memory, is obviously constrained by physical memory. Physical memory also contributes to the system virtual memory limit, which is the sum of roughly the size of physical memory plus the maximum configured size of any paging files. Physical memory also can indirectly limit the maximum number of processes, which I'll talk about in a future post on process and thread limits.

Windows Server Memory Limits

Windows support for physical memory is dictated by hardware limitations, licensing, operating system data structures, and driver compatibility. The Memory Limits for Windows Releases page in MSDN documents the limits for different Windows versions, and within a version, by SKU.

You can see physical memory support licensing differentiation across the server SKUs for all versions of Windows. For example, the 32-bit version of Windows Server 2008 Standard supports only 4GB, while the 32-bit Windows Server 2008 Datacenter supports 64GB. Likewise, the 64-bit Windows Server 2008 Standard supports 32GB and the 64-bit Windows Server 2008 Datacenter can handle a whopping 2TB. There aren't many 2TB systems out there, but the Windows Server Performance Team knows of a couple, including one they had in their lab at one point. Here's a screenshot of Task Manager running on that system:

image

The maximum 32-bit limit of 128GB, supported by Windows Server 2003 Datacenter Edition, comes from the fact that structures the Memory Manager uses to track physical memory would consume too much of the system's virtual address space on larger systems. The Memory Manager keeps track of each page of memory in an array called the PFN database and, for performance, it maps the entire PFN database into virtual memory. Because it represents each page of memory with a 28-byte data structure, the PFN database on a 128GB system requires about 930MB. 32-bit Windows has a 4GB virtual address space defined by hardware that it splits by default between the currently executing user-mode process (e.g. Notepad) and the system. 980MB therefore consumes almost half the available 2GB of system virtual address space, leaving only 1GB for mapping the kernel, device drivers, system cache and other system data structures, making that a reasonable cut off:

image

That's also why the memory limits table lists lower limits for the same SKU's when booted with 4GB tuning (called 4GT and enabled with the Boot.ini's /3GB or /USERVA, and Bcdedit's /Set IncreaseUserVa boot options), because 4GT moves the split to give 3GB to user mode and leave only 1GB for the system. For improved performance, Windows Server 2008 reserves more for system address space by lowering its maximum 32-bit physical memory support to 64GB.

The Memory Manager could accommodate more memory by mapping pieces of the PFN database into the system address as needed, but that would add complexity and possibly reduce performance with the added overhead of map and unmap operations. It's only recently that systems have become large enough for that to be considered, but because the system address space is not a constraint for mapping the entire PFN database on 64-bit Windows, support for more memory is left to 64-bit Windows.

The maximum 2TB limit of 64-bit Windows Server 2008 Datacenter doesn't come from any implementation or hardware limitation, but Microsoft will only support configurations they can test. As of the release of Windows Server 2008, the largest system available anywhere was 2TB and so Windows caps its use of physical memory there.

Windows Client Memory Limits

64-bit Windows client SKUs support different amounts of memory as a SKU-differentiating feature, with the low end being 512MB for Windows XP Starter to 128GB for Vista Ultimate. All 32-bit Windows client SKUs, however, including Windows Vista, Windows XP and Windows 2000 Professional, support a maximum of 4GB of physical memory. 4GB is the highest physical address accessible with the standard x86 memory management mode. Originally, there was no need to even consider support for more than 4GB on clients because that amount of memory was rare, even on servers.

However, by the time Windows XP SP2 was under development, client systems with more than 4GB were foreseeable, so the Windows team started broadly testing Windows XP on systems with more than 4GB of memory. Windows XP SP2 also enabled Physical Address Extensions (PAE) support by default on hardware that implements no-execute memory because its required for Data Execution Prevention (DEP), but that also enables support for more than 4GB of memory.

What they found was that many of the systems would crash, hang, or become unbootable because some device drivers, commonly those for video and audio devices that are found typically on clients but not servers, were not programmed to expect physical addresses larger than 4GB. As a result, the drivers truncated such addresses, resulting in memory corruptions and corruption side effects. Server systems commonly have more generic devices and with simpler and more stable drivers, and therefore hadn't generally surfaced these problems. The problematic client driver ecosystem lead to the decision for client SKUs to ignore physical memory that resides above 4GB, even though they can theoretically address it.

32-bit Client Effective Memory Limits 

While 4GB is the licensed limit for 32-bit client SKUs, the effective limit is actually lower and dependent on the system's chipset and connected devices. The reason is that the physical address map includes not only RAM, but device memory as well, and x86 and x64 systems map all device memory below the 4GB address boundary to remain compatible with 32-bit operating systems that don't know how to handle addresses larger than 4GB. If a system has 4GB RAM and devices, like video, audio and network adapters, that implement windows into their device memory that sum to 500MB, 500MB of the 4GB of RAM will reside above the 4GB address boundary, as seen below:

image

The result is that, if you have a system with 3GB or more of memory and you are running a 32-bit Windows client, you may not be getting the benefit of all of the RAM.  On Windows 2000, Windows XP and Windows Vista RTM, you can see how much RAM Windows has accessible to it in the System Properties dialog, Task Manager's Performance page, and, on Windows XP and Windows Vista (including SP1), in the Msinfo32 and Winver utilities. On Window Vista SP1, some of these locations changed to show installed RAM, rather than available RAM, as documented in this Knowledge Base article.

On my 4GB laptop, when booted with 32-bit Vista, the amount of physical memory available is 3.5GB, as seen in the Msinfo32 utility:

image

You can see physical memory layout with the Meminfo tool by Alex Ionescu (who's contributing to the 5th Edition of the Windows Internals that I'm coauthoring with David Solomon). Here's the output of Meminfo when I run it on that system with the -r switch to dump physical memory ranges:

image

Note the gap in the memory address range from page 9F0000 to page 100000, and another gap from DFE6D000 to FFFFFFFF (4GB). However, when I boot that system with 64-bit Vista, all 4GB show up as available and you can see how Windows uses the remaining 500MB of RAM that are above the 4GB boundary:

image 

What's occupying the holes below 4GB? The Device Manager can answer that question. To check, launch "devmgmt.msc", select Resources by Connection in the View Menu, and expand the Memory node. On my laptop, the primary consumer of mapped device memory is, unsurprisingly, the video card, which consumes 256MB in the range E0000000-EFFFFFFF:

image

Other miscellaneous devices account for most of the rest, and the PCI bus reserves additional ranges for devices as part of the conservative estimation the firmware uses during boot.

The consumption of memory addresses below 4GB can be drastic on high-end gaming systems with large video cards. For example, I purchased one from a boutique gaming rig company that came with 4GB of RAM and two 1GB video cards. I hadn't specified the OS version and assumed that they'd put 64-bit Vista on it, but it came with the 32-bit version and as a result only 2.2GB of the memory was accessible by Windows. You can see a giant memory hole from 8FEF0000 to FFFFFFFF in this Meminfo output from the system after I installed 64-bit Windows:

image

Device Manager reveals that 512MB of the over 2GB hole is for the video cards (256MB each), and it looks like the firmware has reserved more for either dynamic mappings or because it was conservative in its estimate:

image

Even systems with as little as 2GB can be prevented from having all their memory usable under 32-bit Windows because of chipsets that aggressively reserve memory regions for devices. Our shared family computer, which we purchased only a few months ago from a major OEM, reports that only 1.97GB of the 2GB installed is available:

image

The physical address range from 7E700000 to FFFFFFFF is reserved by the PCI bus and devices, which leaves a theoretical maximum of 7E700000 bytes (1.976GB) of physical address space, but even some of that is reserved for device memory, which explains why Windows reports 1.97GB.

image

Because device vendors now have to submit both 32-bit and 64-bit drivers to Microsoft's Windows Hardware Quality Laboratories (WHQL) to obtain a driver signing certificate, the majority of device drivers today can probably handle physical addresses above the 4GB line. However, 32-bit Windows will continue to ignore memory above it because there is still some difficult to measure risk, and OEMs are (or at least should be) moving to 64-bit Windows where it's not an issue.

The bottom line is that you can fully utilize your system's memory (up the SKU's limit) with 64-bit Windows, regardless of the amount, and if you are purchasing a high end gaming system you should definitely ask the OEM to put 64-bit Windows on it at the factory.

Do You Have Enough Memory?

Regardless of how much memory your system has, the question is, is it enough? Unfortunately, there's no hard and fast rule that allows you to know with certainty. There is a general guideline you can use that's based on monitoring the system's "available" memory over time, especially when you're running memory-intensive workloads. Windows defines available memory as physical memory that's not assigned to a process, the kernel, or device drivers. As its name implies, available memory is available for assignment to a process or the system if required. The Memory Manager of course tries to make the most of that memory by using it as a file cache (the standby list), as well as for zeroed memory (the zero page list), and Vista's Superfetch feature prefetches data and code into the standby list and prioritizes it to favor data and code likely to be used in the near future.

If available memory becomes scarce, that means that processes or the system are actively using physical memory, and if it remains close to zero over extended periods of time, you can probably benefit by adding more memory. There are a number of ways to track available memory. On Windows Vista, you can indirectly track available memory by watching the Physical Memory Usage History in Task Manager, looking for it to remain close to 100% over time. Here's a screenshot of Task Manager on my 8GB desktop system (hmm, I think I might have too much memory!):

image

On all versions of Windows you can graph available memory using the Performance Monitor by adding the Available Bytes counter in the Memory performance counter group:

image 

You can see the instantaneous value in Process Explorer's System Information dialog, or, on versions of Windows prior to Vista, on Task Manager's Performance page.

Pushing the Limits

Out of CPU, memory and disk, memory is typically the most important for overall system performance. The more, the better. 64-bit Windows is the way to go to be sure that you're taking advantage of all of it, and 64-bit Windows can have other performance benefits that I'll talk about in a future Pushing the Limits blog post when I talk about virtual memory limits.

The Case of the Random IE and WMP Crashes

When I experienced a crash in Internet Explorer (IE) on my home 64-bit gaming system one day, I chalked it up to random third-party plug-in memory corruption. I moved on, but a few days later had another crash in IE. Then, Windows Media Player (WMP) started crashing every third or fourth time I used it:

 

Crashes in different programs seemed to point at a more fundamental problem. I had over-clocked the CPU, so I speculated that the rash of crashes were a side-effect of CPU overheating and reluctantly dialed back the clock multiplier to the factory specification.  To my dismay, however, the crashes continued. My next theory was that I had bad RAM, but the Windows Vista Memory Diagnostic failed to identify any problems.

Hardware problems seemingly cleared, my next move was to look at the process crash dumps to see if they held any clues. But first I had to find a crash dump to look at. Windows XP’s Application Error Reporting process always generates a dump before showing you the application crash dialog, and you can find the location of the dump by clicking to see the report details and then viewing the report’s technical information:

 

Windows Vista’s corresponding dialog doesn’t offer a way to get at a report’s technical information and it doesn’t generate a dump unless Microsoft’s Windows Error Reporting (WER) servers request it, which they only do for crashes reported in high volumes. Fortunately, WerFault, the process that presents the dialog, keeps the crashed process around until you press the Close Program button, which offers an opportunity to attach to the process with a debugger and examine it. You can see WerFault’s handle to a crashed Windows Media Player process in Process Explorer:

The next time I had a crash, I launched WinDbg, the Windows Debugger from the Debugging Tools for Windows package that’s available for free download from Microsoft. After making sure that I had the symbol configuration set to point at the Microsoft public symbol server (e.g. srv*c:\symbols*http://msdl.microsoft.com/download/symbols) in the Symbol File Path dialog, I went to the File menu and selected the “Attach to a Process...” menu entry:

 

That opens the WinDbg process selection dialog, which I scrolled through to find the crashed process. When I selected the process, WinDbg opened it and presented the same interface it does when it loads a crash dump, except that when you load a crash dump, you can execute the !analyze debugger command that uses heuristics to try and pinpoint the cause of the crash; when you perform a debugger attach, an analysis will just tell you what you already know, that you attached with a debugger:

 

Looking for a potential cause of a crash when attached requires looking at the stack of each thread in the process, so I opened the Processes and Threads and Call Stack dialogs in the View menu:

 

I started examining threads by selecting the first entry in the threads dialog:

 

The WinDbg command window usually grays and says “Busy” as WinDbg pulls symbols from the symbol server, after which the call stack dialog populates with the function nesting of the selected thread at the time of the crash. I examined each thread’s stack in turn, moving between threads by pressing the down arrow and then the enter key, hunting for a stack that had function names with the words “exception” or “fault” in them. Near the end of the list I came across this one:

 

I noticed that the top of the list is full of functions with “Exception” in their names. Looking down the list (up the stack), I saw that a function in Nvappfilter called Kernel32.dll’s HeapFree function, leading to the crash. The exception in the heap’s free routines meant that either the caller passed a bogus heap address or that the heap was already corrupted when the function executed. If a Windows DLL had been the caller I would have suspected the latter, but in this case the caller was a third-party DLL, which I could tell by the fact that WinDbg couldn’t locate symbol information for it and hence didn’t know the names of the functions within it. I confirmed that by issuing the lm (list module) command to look at its version information:

 

Nvappfilter was now my primary suspect, but I didn’t have direct evidence that it was responsible. I continued to use the system and followed the same debugging steps on the next several crashes. Whether it was IE, WMP or a game, the faulting stack was always the same, with Nvappfilter calling HeapFree. That’s still not conclusive proof, but the anecdotal evidence was pretty compelling.

At that point I went to see if there were updates for Nvappfilter, but I wasn’t sure what software package it was associated with. I entered its name in a Web search and discovered that it’s part of the nVidia’s FirstPacket feature that prioritizes game traffic and that’s included in the nForce motherboard’s software:

 

I went to nVidia’s site and downloaded the most recent nForce driver package, but it failed to update Nvappfilter.dll and I continued to have the crashes.

The nVidia control panel offers no way that I could find to prevent Nvappfilter from loading, so my only recourse was to manually disable it. I wasn’t using the FirstPacket feature, which I had previously been unaware of, so I wouldn’t miss it, but first I had to figure out how it configured Windows to load it. For that I turned to Autoruns, where I found references to Nvappfilter’s 32-bit and 64-bit versions in the Winsock Layered Service Provider (LSP) section:

 

I deleted all of Nvappfilter’s entries, rebooted the system and have been crash-free since. While I was writing this post, I checked again for nForce software updates to see if Nvappfilter had been updated. The latest version doesn’t look like it includes Nvappfilter or any other Winsock LSP, so assuming Nvappfilter was at fault, it’s no longer an issue.

One other thing I’ve done since I investigated these crashes is take advantage of Vista SP1’s “local dumps” functionality so that I'll automatically get a crash dump to investigate for any application crash I experience. If you create a key named HKLM\Software\Microsoft\Windows\Windows Error Reporting\LocalDumps, WerFault will always save a dump. Crashes go by default into %LOCALAPPDATA%\Crashdumps, but you can override that with a Registry value and also specify a limit on the number of crashes WerFault will keep.

Guest Post: The Case of the FrontPage Error

Welcome to the first guest "Case Of" blog post! I've received numerous great troubleshooting cases over the last two months and have selected this one, submitted by Troy Wolbrink, a corporate web master, as the first to share with you.

 

Troy ran into a problem with his web server and instead of rebooting, reinstalling, or calling Microsoft Product Support Services (who would have undoubtedly suggested the same steps Troy followed on his own, but cost Troy an incident), he used basic troubleshooting techniques to solve it in a few minutes. Like for everyone else that's submitted a well-documented case with screenshots and log files, I sent Troy a signed copy of Windows Internals as a thanks.

 

I'm including some of the cases I've received in my "Case of the Unexplained..." talk again at TechEd/IT Pro in June. In fact, even if you've watched the webcast from last November's TechEd/ITForum, be sure to come to the session as it consists of entirely new new cases.

 

Before I present the case, I want to share a clever user-produced video that serves as both a tutorial and demonstration of Zoomit, a screen magnifier and annotation tool I developed as an aid for my presentations. You can find the video here.

 

Enjoy Troy's post and the Zoomit video and please keep submitting your troubleshooting cases!

 

-Mark

 

The Case of the FrontPage Error

by Windows Detective Troy Wolbrink

 

I recently transitioned my website from shared hosting to my own dedicated server.  Parts of my website were using FrontPage Server Extensions (FPSE), such as the page which collected data and logged the results to a text file:

 

I installed FPSE using the Add/Remove Windows Components control panel applet, and then I did my best to configure them correctly with IIS.  But for some reason I could not get FPSE configured correctly such that my data collection form would work.  The error message from the browser was very vague:

 

I looked for more information in the Event Log on the server, but it had nothing relevant.

  

My plan was to eventually replace this site with one built on ASP.NET and SQL Server, so I wasn’t feeling motivated to become a FPSE expert just to solve this one problem.  But since other priorities are preventing this move to ASP.NET from happening for another year, so I had no choice but to investigate. I had just seen Mark’s talk on “The Case of the Unexplained...”, and I was inspired to run Process Monitor on my server to see if any clues to the problem might appear.  I excluded events from some obvious processes that had nothing to do with the problem.  This removed a lot of the noise.  Since my web page was “TntWebLog.htm” and since it was writing to “TntWebLog.csv” I configured it to highlight anything with a Path containing “TntWebLog”. 

 

To reproduce the problem, I pulled up my web browser and tried to submit the form.  Then back in Process Monitor I told it to stop capturing events.  I then scrolled down through the list looking for highlighted entries.  It was astonishing how quickly I found the problem.  FPSE was trying to create the file with the same name as the existing file:

 

 

Obviously something was up with file permissions here.  I checked the file permissions for TntWebLog.csv and it didn’t have a listing for the IUSR_WEBBOARD account which is what is configured in IIS for anonymous access: