Mark Russinovich’s technical blog covering topics such as Windows troubleshooting, technologies and security.
In my first Pushing the Limits of Windows post, I discussed physical memory limits, including the limits imposed by licensing, implementation, and driver compatibility. Here’s the index of the entire Pushing the Limits series. While they can stand on their own, they assume that you read them in order.
Pushing the Limits of Windows: Physical Memory
Pushing the Limits of Windows: Virtual Memory
Pushing the Limits of Windows: Paged and Nonpaged Pool
Pushing the Limits of Windows: Processes and Threads
Pushing the Limits of Windows: Handles
Pushing the Limits of Windows: USER and GDI Objects – Part 1
Pushing the Limits of Windows: USER and GDI Objects – Part 2
This time I’m turning my attention to another fundamental resource, virtual memory. Virtual memory separates a program’s view of memory from the system’s physical memory, so an operating system decides when and if to store the program’s code and data in physical memory and when to store it in a file. The major advantage of virtual memory is that it allows more processes to execute concurrently than might otherwise fit in physical memory.
Although virtual memory limits are related to physical memory limits, they derive from different sources and differ depending on the consumer. For example, there are virtual memory limits that apply to individual processes that run applications, to the operating system, and to the system as a whole. It's important to remember as you read this that virtual memory, as the name implies, has no direct connection with physical memory. Windows assigning the file cache a certain amount of virtual memory does not dictate how much file data it actually caches in physical memory; it can be any amount from none to more than the amount that's addressable via virtual memory.
Each process has its own virtual memory, called an address space, into which it maps the code that it executes and the data that the code references and manipulates. A 32-bit process uses 32-bit virtual memory address pointers, which creates an absolute upper limit of 4GB (2^32) for the amount of virtual memory that a 32-bit process can address. However, so that the operating system can reference its own code and data and the code and data of the currently-executing process without changing address spaces, the operating system makes its virtual memory visible in the address space of every process. By default, 32-bit versions of Windows split the process address space evenly between the system and the active process, creating a limit of 2GB for each:
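The arithmetic behind these limits can be sketched in a few lines of Python. This is only an illustration: the helper name split_address_space is hypothetical, and nothing here queries Windows.

```python
GB = 1024 ** 3  # bytes in a gigabyte

def split_address_space(pointer_bits, user_bytes):
    """Return (user, system) sizes in bytes for a given user-mode portion.

    Hypothetical helper for illustration; it only does the arithmetic.
    """
    total = 2 ** pointer_bits
    if not 0 <= user_bytes <= total:
        raise ValueError("user portion must fit in the address space")
    return user_bytes, total - user_bytes

# A 32-bit pointer can address 2^32 bytes in total.
total = 2 ** 32

# Default 32-bit Windows layout: half for the process, half for the system.
user, system = split_address_space(32, 2 * GB)

print(total // GB, user // GB, system // GB)  # prints: 4 2 2
```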
Applications might use Heap APIs, the .NET garbage collector, or the C runtime malloc library to allocate virtual memory, but under the hood all of these rely on the VirtualAlloc API. When an application runs out of address space then VirtualAlloc, and therefore the memory managers layered on top of it, return errors (represented by a NULL address). The Testlimit utility, which I wrote for the 4th Edition of Windows Internals to demonstrate various Windows limits, calls VirtualAlloc repeatedly until it gets an error when you specify the –r switch. Thus, when you run the 32-bit version of Testlimit on 32-bit Windows, it will consume the entire 2GB of its address space:
2010 MB isn’t quite 2GB, but Testlimit’s other code and data, including its executable and system DLLs, account for the difference. You can see the total amount of address space it’s consumed by looking at its Virtual Size in Process Explorer:
Some applications, like SQL Server and Active Directory, manage large data structures and perform better the more that they can load into their address space at the same time. Windows NT 4 SP3 therefore introduced a boot option, /3GB, that gives a process 3GB of its 4GB address space by reducing the size of the system address space to 1GB, and Windows XP and Windows Server 2003 introduced the /userva option that moves the split anywhere between 2GB and 3GB:
To take advantage of the address space above the 2GB line, however, a process must have the ‘large address space aware’ flag set in its executable image. Access to the additional virtual memory is opt-in because some applications have assumed that they’d be given at most 2GB of the address space. Since the high bit of a pointer referencing an address below 2GB is always zero, they would use the high bit in their pointers as a flag for their own data, clearing it of course before referencing the data. If they ran with a 3GB address space they would inadvertently truncate pointers that have values greater than 2GB, causing program errors including possible data corruption.
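To see why access is opt-in, here is a minimal sketch of the high-bit trick, using Python integers as stand-ins for 32-bit pointers. The function names and the example addresses are made up for illustration.

```python
HIGH_BIT = 0x80000000      # bit 31 of a 32-bit pointer
ADDRESS_MASK = 0x7FFFFFFF  # everything except bit 31

def tag(pointer):
    """Use the high bit as an application-private flag."""
    return pointer | HIGH_BIT

def untag(pointer):
    """Clear the flag before dereferencing, assuming addresses stay below 2GB."""
    return pointer & ADDRESS_MASK

below_2gb = 0x7FFE0000  # hypothetical address in a 2GB user address space
above_2gb = 0xA0000000  # hypothetical address only reachable with a 3GB split

# Below 2GB the round trip is lossless.
print(hex(untag(tag(below_2gb))))  # prints: 0x7ffe0000

# Above 2GB the high bit is part of the address, so clearing it
# silently yields a different address.
print(hex(untag(tag(above_2gb))))  # prints: 0x20000000
```

The second result is the truncation described above: the program would read or write 0x20000000 instead of the address it was given.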
All Microsoft server products and data intensive executables in Windows are marked with the large address space awareness flag, including Chkdsk.exe, Lsass.exe (which hosts Active Directory services on a domain controller), Smss.exe (the session manager), and Esentutl.exe (the Active Directory Jet database repair tool). You can see whether an image has the flag with the Dumpbin utility, which comes with Visual Studio:
Testlimit is also marked large-address aware, so if you run it with the –r switch when booted with the 3GB of user address space, you’ll see something like this:
Because the address space on 64-bit Windows is much larger than 4GB, something I’ll describe shortly, Windows can give 32-bit processes the maximum 4GB that they can address and use the rest for the operating system’s virtual memory. If you run Testlimit on 64-bit Windows, you’ll see it consume the entire 32-bit addressable address space:
64-bit processes use 64-bit pointers, so their theoretical maximum address space is 16 exabytes (2^64). However, Windows doesn’t divide the address space evenly between the active process and the system, but instead defines a region in the address space for the process and others for various system memory resources, like system page table entries (PTEs), the file cache, and paged and non-paged pools.
The size of the process address space is different on IA64 and x64 versions of Windows where the sizes were chosen by balancing what applications need against the memory costs of the overhead (page table pages and translation lookaside buffer - TLB - entries) needed to support the address space. On x64, that’s 8192GB (8TB) and on IA64 it’s 7168GB (7TB - the 1TB difference from x64 comes from the fact that the top level page directory on IA64 reserves slots for Wow64 mappings). On both IA64 and x64 versions of Windows, the size of the various resource address space regions is 128GB (e.g. non-paged pool is assigned 128GB of the address space), with the exception of the file cache, which is assigned 1TB. The address space of a 64-bit process therefore looks something like this:
The figure isn’t drawn to scale, because even 8TB, much less 128GB, would be a small sliver. Suffice it to say that like our universe, there’s a lot of emptiness in the address space of a 64-bit process.
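To put numbers on that emptiness, the following sketch compares the sizes quoted above with the theoretical 2^64 bytes. It is plain arithmetic on the figures in the text and inspects nothing on a real system.

```python
GB = 1024 ** 3
TB = 1024 ** 4

theoretical = 2 ** 64    # 16 exabytes
user_x64 = 8 * TB        # process address space on x64
pool_region = 128 * GB   # e.g. the non-paged pool region

# How many regions of each size would fit in the full 64-bit space.
print(theoretical // user_x64)     # prints: 2097152
print(theoretical // pool_region)  # prints: 134217728
```

In other words, the 8TB process region is about one two-millionth of the theoretical address space.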
When you run the 64-bit version of Testlimit (Testlimit64) on 64-bit Windows with the –r switch, you’ll see it consume 8TB, which is the size of the part of the address space it can manage:
Testlimit’s –r switch has it reserve virtual memory, but not actually commit it. Reserved virtual memory can’t actually store data or code, but applications sometimes use a reservation to create a large block of virtual memory and then commit it as needed to ensure that the committed memory is contiguous in the address space. When a process commits a region of virtual memory, the operating system guarantees that it can maintain all the data the process stores in the memory either in physical memory or on disk. That means that a process can run up against another limit: the commit limit.
As you’d expect from the description of the commit guarantee, the commit limit is the sum of physical memory and the sizes of the paging files. In reality, not quite all of physical memory counts toward the commit limit since the operating system reserves part of physical memory for its own use. The amount of committed virtual memory for all the active processes, called the current commit charge, cannot exceed the system commit limit. When the commit limit is reached, virtual allocations that commit memory fail. That means that even a standard 32-bit process may get virtual memory allocation failures before it hits the 2GB address space limit.
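A minimal sketch of this bookkeeping follows. The function names, the held_back parameter and the example sizes are assumptions made for illustration; Windows does not expose its accounting in this form.

```python
GB = 1024 ** 3

def commit_limit(ram, paging_files, held_back=0):
    """Physical memory (minus what the OS keeps for itself) plus all paging files."""
    return ram - held_back + sum(paging_files)

def try_commit(charge, request, limit):
    """Return the new commit charge, or None if the request would exceed the limit."""
    new_charge = charge + request
    return new_charge if new_charge <= limit else None

# Hypothetical system: 1GB of RAM and a single 1.5GB paging file.
limit = commit_limit(1 * GB, [GB * 3 // 2])  # 2.5GB

# Other processes have already committed 2GB.
charge = 2 * GB

print(try_commit(charge, GB // 4, limit) is not None)  # prints: True
print(try_commit(charge, 1 * GB, limit))               # prints: None
```

The second request fails even though 1GB is well within a 2GB address space, which is the situation described above.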
The current commit charge and commit limit are tracked by Process Explorer in its System Information window in the Commit Charge section and in the Commit History bar chart and graph:
Task Manager prior to Vista and Windows Server 2008 shows the current commit charge and limit similarly, but calls the current commit charge "PF Usage" in its graph:
On Vista and Server 2008, Task Manager doesn't show the commit charge graph and labels the current commit charge and limit values with "Page File" (despite the fact that they will be non-zero values even if you have no paging file):
You can stress the commit limit by running Testlimit with the -m switch, which directs it to allocate committed memory. The 32-bit version of Testlimit may or may not hit its address space limit before hitting the commit limit, depending on the size of physical memory, the size of the paging files and the current commit charge when you run it. If you're running 32-bit Windows and want to see how the system behaves when you hit the commit limit, simply run multiple instances of Testlimit until one hits the commit limit before exhausting its address space.
Note that, by default, the paging file is configured to grow, which means that the commit limit will grow when the commit charge nears it. And even when the paging file hits its maximum size, Windows is holding back some memory and its internal tuning, as well as that of applications that cache data, might free up more. Testlimit anticipates this and when it reaches the commit limit, it sleeps for a few seconds and then tries to allocate more memory, repeating this indefinitely until you terminate it.
If you run the 64-bit version of Testlimit, it will almost certainly hit the commit limit before exhausting its address space, unless physical memory and the paging files sum to more than 8TB, which as described previously is the size of the 64-bit application-accessible address space. Here's the partial output of the 64-bit Testlimit running on my 8GB system (I specified an allocation size of 100MB to make it leak more quickly):
And here's the commit history graph with steps when Testlimit paused to allow the paging file to grow:
When system virtual memory runs low, applications may fail and you might get strange error messages when attempting routine operations. In most cases, though, Windows will be able to present you with the low-memory resolution dialog, like it did for me when I ran this test:
After you exit Testlimit, the commit limit will likely drop again when the memory manager truncates the tail of the paging file that it created to accommodate Testlimit's extreme commit requests. Here, Process Explorer shows that the current limit is well below the peak that was achieved when Testlimit was running:
Because the commit limit is a global resource whose consumption can lead to poor performance, application failures and even system failure, a natural question is 'how much are processes contributing to the commit charge?' To answer that question accurately, you need to understand the different types of virtual memory that an application can allocate.
Not all the virtual memory that a process allocates counts toward the commit limit. As you've seen, reserved virtual memory doesn't. Virtual memory that represents a file on disk, called a file mapping view, also doesn't count toward the limit unless the application asks for copy-on-write semantics, because Windows can discard any data associated with the view from physical memory and then retrieve it from the file. The virtual memory in Testlimit's address space where its executable and system DLL images are mapped therefore doesn't count toward the commit limit. There are two types of process virtual memory that do count toward the commit limit: private and pagefile-backed.
Private virtual memory is the kind that underlies the garbage collector heap, native heap and language allocators. It's called private because by definition it can't be shared between processes. For that reason, it's easy to attribute to a process and Windows tracks its usage with the Private Bytes performance counter. Process Explorer displays a process's private bytes usage in the Private Bytes column, in the Virtual Memory section of the Performance page of the process properties dialog, and displays it in graphical form on the Performance Graph page of the process properties dialog. Here's what Testlimit64 looked like when it hit the commit limit:
Pagefile-backed virtual memory is harder to attribute, because it can be shared between processes. In fact, there's no process-specific counter you can look at to see how much a process has allocated or is referencing. When you run Testlimit with the -s switch, it allocates pagefile-backed virtual memory until it hits the commit limit, but even after consuming over 29GB of commit, the virtual memory statistics for the process don't provide any indication that it's the one responsible:
For that reason, I added the -l switch to Handle a while ago. A process must open a pagefile-backed virtual memory object, called a section, for it to create a mapping of pagefile-backed virtual memory in its address space. While Windows preserves existing virtual memory even if an application closes the handle to the section that it was made from, most applications keep the handle open. The -l switch prints the size of the allocation for pagefile-backed sections that processes have open. Here's partial output for the handles open by Testlimit after it has run with the -s switch:
You can see that Testlimit is allocating pagefile-backed memory in 1MB blocks and if you summed the size of all the sections it had opened, you'd see that it was at least one of the processes contributing large amounts to the commit charge.
Perhaps one of the most commonly asked questions related to virtual memory is, how big should I make the paging file? There’s no end of ridiculous advice out on the web and in the newsstand magazines that cover Windows, and even Microsoft has published misleading recommendations. Almost all the suggestions are based on multiplying RAM size by some factor, with common values being 1.2, 1.5 and 2. Now that you understand the role that the paging file plays in defining a system’s commit limit and how processes contribute to the commit charge, you’re well positioned to see how useless such formulas truly are.
Since the commit limit sets an upper bound on how much private and pagefile-backed virtual memory can be allocated concurrently by running processes, the only way to reasonably size the paging file is to know the maximum total commit charge for the programs you like to have running at the same time. If the commit limit is smaller than that number, your programs won’t be able to allocate the virtual memory they want and will fail to run properly.
So how do you know how much commit charge your workloads require? You might have noticed in the screenshots that Windows tracks that number and Process Explorer shows it: Peak Commit Charge. To optimally size your paging file you should start all the applications you run at the same time, load typical data sets, and then note the commit charge peak (or look at this value after a period of time where you know maximum load was attained). Set the paging file minimum to be that value minus the amount of RAM in your system (if the value is negative, pick a minimum size to permit the kind of crash dump you are configured for). If you want to have some breathing room for potentially large commit demands, set the maximum to double that number.
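The sizing rule above can be written down as a short sketch. The function name and the example figures are hypothetical, the crash dump floor is something you would choose yourself, and reading "double that number" as double the minimum is an assumption.

```python
def size_paging_file(peak_commit_mb, ram_mb, dump_floor_mb):
    """Return (minimum, maximum) paging file sizes in MB.

    peak_commit_mb: observed Peak Commit Charge under your heaviest workload
    ram_mb:         installed RAM
    dump_floor_mb:  size you picked to permit your configured crash dump type
    """
    minimum = peak_commit_mb - ram_mb
    if minimum <= 0:
        # RAM alone covers the peak (zero is treated like negative here,
        # which is an assumption), so fall back to the crash dump floor.
        minimum = dump_floor_mb
    # Breathing room for unexpectedly large commit demands.
    maximum = 2 * minimum
    return minimum, maximum

# Hypothetical system: 8GB of RAM, 11GB observed peak commit charge.
print(size_paging_file(11 * 1024, 8 * 1024, 1024))  # prints: (3072, 6144)

# Hypothetical system where the peak never exceeds RAM.
print(size_paging_file(5 * 1024, 8 * 1024, 1024))   # prints: (1024, 2048)
```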
Some feel having no paging file results in better performance, but in general, having a paging file means Windows can write pages on the modified list (which represent pages that aren’t being accessed actively but have not been saved to disk) out to the paging file, thus making that memory available for more useful purposes (processes or file cache). So while there may be some workloads that perform better with no paging file, in general having one will mean more usable memory being available to the system (never mind that Windows won’t be able to write kernel crash dumps without a paging file sized large enough to hold them).
Paging file configuration is in the System properties, which you can get to by typing “sysdm.cpl” into the Run dialog, clicking on the Advanced tab, clicking on the Performance Options button, clicking on the Advanced tab (this is really advanced), and then clicking on the Change button:
You’ll notice that the default configuration is for Windows to automatically manage the page file size. When that option is set on Windows XP and Server 2003, Windows creates a single paging file whose minimum size is 1.5 times RAM if RAM is less than 1GB, and RAM if it's greater than 1GB, and whose maximum size is three times RAM. On Windows Vista and Server 2008, the minimum is intended to be large enough to hold a kernel-memory crash dump and is RAM plus 300MB or 1GB, whichever is larger. The maximum is either three times the size of RAM or 4GB, whichever is larger. That explains why the peak commit on my 8GB 64-bit system that’s visible in one of the screenshots is 32GB. I guess whoever wrote that code got their guidance from one of those magazines I mentioned!
A couple of final limits related to virtual memory are the maximum size and number of paging files supported by Windows. 32-bit Windows has a maximum paging file size of 16TB (4GB if you for some reason run in non-PAE mode) and 64-bit Windows can have paging files that are up to 16TB in size on x64 and 32TB on IA64. Windows 8 ARM’s maximum paging file size is 4GB. For all versions, Windows supports up to 16 paging files, where each must be on a separate volume.
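As a rough sketch, the system-managed defaults described above can be expressed as two small functions. The function names are hypothetical, sizes are in MB, the behavior at exactly 1GB of RAM is an assumption (the text only covers less than and greater than), and "RAM plus 300MB or 1GB, whichever is larger" is read here as the larger of RAM + 300MB and 1GB.

```python
GB_IN_MB = 1024

def auto_size_xp_2003(ram_mb):
    """(minimum, maximum) paging file size in MB on Windows XP / Server 2003."""
    if ram_mb < GB_IN_MB:
        minimum = ram_mb * 3 // 2
    else:
        minimum = ram_mb  # exactly 1GB is treated like "greater than" (assumption)
    return minimum, 3 * ram_mb

def auto_size_vista_2008(ram_mb):
    """(minimum, maximum) paging file size in MB on Windows Vista / Server 2008."""
    minimum = max(ram_mb + 300, GB_IN_MB)
    maximum = max(3 * ram_mb, 4 * GB_IN_MB)
    return minimum, maximum

ram = 8 * GB_IN_MB
minimum, maximum = auto_size_vista_2008(ram)
print(minimum, maximum)             # prints: 8492 24576

# RAM plus a fully grown paging file, ignoring memory the OS holds back.
print((ram + maximum) // GB_IN_MB)  # prints: 32
```

The last figure matches the 32GB seen on the 8GB system: 8GB of RAM plus a paging file grown to three times RAM.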
So if I have 1.5 GB of RAM in my HP laptop, what shall be the Custom initial size and maximum size, considering I want to get the best I can, out of my computer?
Thanks in advance..
As non-enterprise user:
If I have 8GB RAM, the 1.5 "rule" isn't necessary so I won't have to waste 12GB just for the pagefile, right?
I'd be better off setting it to 4GB (or 6GB if I run into problems)?
Also what about SSDs?
An MSDN blog advises to leave it on your SSD, as it is a perfect match:
Corsair however advises against it because of flash wear:
There it states you should move it to a spinning HDD as the performance decrease is negligible, but the SSD will be spared (which in turn should mean higher performance over a longer period of time as it isn't worn down as quickly).
On the other hand, this is probably meant for 60 and 120GB drives that run out of usable flash earlier - I'd like to go with a 256GB, so shouldn't wear be less of a problem here?
So...8GB RAM+256GB SSD = ?
Leave a 4 or 6GB pagefile on the SSD, move it to an HDD (and possibly increase it to 12GB since enough space is available), or disable it completely (probably not so hot, this idea)?
Any advice is welcome.
After reading this article, I have been running a machine with Windows 7 Ultimate x64 with 8GB of RAM and a 16MB paging file. The virtual lack of a paging file has shown me that it's not always necessary. I'm now back reading this article again contemplating the use of a reasonably small paging file for a Windows Server 2008 R2 machine with SQL Server 2008 R2 and 24Gb of RAM running under Hyper-V Server 2008 R2.
On my last machine, I put the page file into a separate partition so that it wasn't included in my system partition backups, and I used FAT32 because I wanted to. I also decided that the file might as well be 4 gigabytes, because that was plenty, I could afford to give up the disk space, and on FAT32 it couldn't be any bigger.
Now I have a new machine to set up... and someone is telling me that I should still have at least 300 MB of pagefile on volume C. Is there a justification for that? Something to do with successful system crash dumps, maybe? I previously did what I liked and it seemed to work OK.
Robert: You can do that, but you need to set the DedicatedDumpFile option to redirect the kernel dump from c:\ to the pagefile on the other disk.
New behavior in Windows Vista and Windows Server 2008
In Windows Vista and Windows Server 2008, the paging file does not have to be on the same partition as the partition on which the operating system is installed. To put a paging file on another partition, you must create a new registry entry named DedicatedDumpFile. You can also define the size of the paging file by using a new registry entry that is named DumpFileSize. But we cannot dump to a spanned volume like a stripeset or RAID5.
To create the DedicatedDumpFile and DumpFileSize registry entries, follow these steps:
Click Start, click Run, type Regedit, and then click OK.
Locate and then click the following registry subkey:
On the Edit menu, point to New, and then click String Value.
In the details pane, type DedicatedDumpFile, and then press ENTER.
Right-click DedicatedDumpFile, and then click Modify.
In the Value data box, type <drive>:\<dedicateddumpfile.sys>, and then click OK.
Note <drive> is a placeholder for a drive that has enough disk space for the paging file, and <dedicateddumpfile.sys> is a placeholder for the dedicated file and the full path.
On the Edit menu, point to New, and then click DWORD Value.
Type DumpFileSize , and then press ENTER.
Right-click DumpFileSize, and then click Modify.
In the Edit DWORD Value dialog box, click Decimal under Base.
In the Value data box, type the appropriate value, and then click OK.
Note The size of the dump file is in megabytes.
Right-click DumpFile, and then click Modify.
In the Value data box, type <drive>:\<path>\Memory.dmp, and then click OK.
Note When the system crashes, this is the location where the memory dump file is created by using the dedicated file instead of by using the Pagefile.sys file.
Exit Registry Editor.
Restart Windows in order for your changes to take effect.
We are using software that is tied to 32-bit Windows operating system and we are running into major issues as we hit the PVM limits of the system. We already have the large address aware enabled to allocate 3GB address space for our application. Is there anything we can do to work around these limits (from the OS perspective), or are we stuck with our only options being upgrade to 64-bit OS or break-up the application into multiple processes?
Thanks in advance!
@DW: You could push it to 3300 with the /UserVA=3300 boot.ini switch (or bcdedit equivalent). The extra 300Mb isn't much more though (compared to /3Gb), and you'll put a lot of pressure on NPP and PP to fit in the remaining 700Mb. Note, if you use an x64 OS, you'll still only get 4Gb (with the x86 program). You need to get an x64 app on an x64 OS to really benefit from extra (>4Gb) memory.
@DW: If you want to use more than 3GB (3.3 with /USERVA= switch) from within x86 application, you should use EITHER multiple processes OR use some mechanism p. e. AWE to indirectly manage further memory pages.
(note: I'm not a Mark Russinovich so I can be wrong :-)
I've read this series of articles and the Microsoft Windows Internals book. Thank you very much. I haven't found anything this informative since Peter Norton's books (way back when).
I have an issue I can't let go.
By design, 32-bit Windows total Virtual address space is 4GB. Thus for me - Pagefile Size + Physical Ram must be less than or equal to 4GB total. If the physical ram is 4GB and the pagefile is 2GB, then nothing in the pagefile is used by the system during normal operations. The pagefile would only be used for a crashdump. Of course, this "might" change if using an application that is AWE aware.
By design, x64 total Virtual address space is 16TB (8TB User/8TB System). As you say in this article, "Windows can have paging files that are up to 16TB in size". KB Article 294418 lists the maximum paging file size as 256TB. They assume the 16TB limit and allow for 16 pagefiles. This doesn't make any sense to me as the virtual space is limited by the operating system at 16TB. I've seen this 256TB pagefile notation several times. I can only think that you would need a pagefile over 16TB if x64 supported some sort of AWE mechanism. So far, I haven't heard about one.
If you are following up at this late date, I want to understand. Again Thanks.
According to this post, when running a large address aware 32-bit application in a 64-bit windows 7 the application could use the full 32-bit address space (4GB). This makes sense to me, however when running Visual Studio 2010 the limit is reached around 2GB.
Why is this happening? I checked with dumpbin /header, and devenv.exe reports that it can handle large addresses (after I ran "editbin /largeaddressaware devenv.exe"; I don't know how it was set before).
I would like to understand why.
The limit is per process and not per computer. Windows itself (the Kernel) seems to use AWE.