Thoughts from the EPS Windows Server Performance Team
Good morning AskPerf! One of the largest sources of support calls that we still see is System PTE depletion on Windows Server 2003 Terminal Servers (x86). Scaling your W2K3 Terminal Server can be challenging as load increases with additional users and runs into the limits of the x86 memory architecture. Terminal Servers often run out of kernel resources because of load rather than any type of memory leak. One of those resources is System PTEs (Page Table Entries). So, what are System PTEs?
System PTEs are a kernel memory structure used to map the following:
Other operating system internals that use PTEs:
NOTE The normal range of PTE’s are as follows (can vary from these values):
Free System PTE’s can be viewed by adding the counter “Memory\Free System Page Table Entries” in Perfmon. Free System PTE’s should be kept above 10,000.
When a system starts running low on free PTEs (fewer than 5,000), unusual things can start occurring on the server with little to no warning. First, applications or drivers may get insufficient-memory errors. Second, applications or the server itself might hang if threads cannot be created. Another symptom could be errors in the event logs, similar to the following:
Event Type: Information
Event Source: dmio
Event Category: None
Event ID: 29
Date: MM/DD/YYYY
Time: HH:MM:SS AM/PM
User: N/A
Computer: Computer_Name
Description: dmio: Harddisk9 read error at block 445136247: status 0xC000009A

Event Type: Information
Event Source: dmio
Event Category: None
Event ID: 30
Date: MM/DD/YYYY
Time: HH:MM:SS AM/PM
User: N/A
Computer: Computer_Name
Description: dmio: Harddisk2 write error at block 411779656: status 0xC000009A
0xC000009A is a Windows status code that translates to STATUS_INSUFFICIENT_RESOURCES.
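As a rough illustration of how to read such a code, the sketch below splits an NTSTATUS value into its standard bit fields (severity, customer flag, facility, code). The function name `decode_ntstatus` is made up for this example; it is not part of any Windows tool.

```python
def decode_ntstatus(status: int) -> dict:
    """Split a 32-bit NTSTATUS value into its bit fields.

    Hypothetical helper for illustration only.
    """
    return {
        # Bits 31-30: 0 = success, 1 = informational, 2 = warning, 3 = error
        "severity": (status >> 30) & 0x3,
        # Bit 29: set for customer-defined (non-Microsoft) codes
        "customer": (status >> 29) & 0x1,
        # Bits 27-16: facility that produced the status
        "facility": (status >> 16) & 0xFFF,
        # Bits 15-0: the status code itself
        "code": status & 0xFFFF,
    }


fields = decode_ntstatus(0xC000009A)
print(fields)
```

For 0xC000009A the severity field is 3, which marks the status as an error even though the event itself is logged with type Information.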
A program is composed of one or more threads. Each of those threads has a kernel-mode stack (the kernel mode part of a user mode program), and each thread uses some PTEs. Normal threads use 4 while UI threads (threads related to displaying info on the screen) use 16. Thread stacks can end up using a majority of the PTEs on a system. Let’s consider the following example:
Customer was previously able to have 50 logged on users to their terminal servers. Now, they are only able to have ~35 users before new users start receiving errors when connecting to the TS.
Let’s look at the relevant data to determine the cause of the problem. Perfmon data captured for several hours shows how load and memory resources vary.
The screenshot below highlights the number of sessions on the system. We start with 36 sessions and get as high as 38 sessions until the end, where we fall down to 5 user sessions.
Now let’s look at free PTEs. Free System PTEs start at 22,000 and get as low as 4,000 as more users log in. As users start logging off, we shoot up to ~145,000 free System PTEs. The fact that Free System PTEs recover as users log off shows that this is a load issue and not a memory leak.
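The thresholds mentioned earlier (keep free PTEs above 10,000; expect trouble below 5,000) can be applied to the values from this capture with a small sketch like the one below. The function name and the "ok" / "warning" / "critical" labels are assumptions made for this example, not Perfmon terminology.

```python
# Thresholds taken from the article text; labels are illustrative only.
HEALTHY_FLOOR = 10_000  # free System PTEs should stay above this
DANGER_FLOOR = 5_000    # below this, unusual failures can start


def classify_free_ptes(free_ptes: int) -> str:
    """Return a coarse health label for a Free System PTEs sample."""
    if free_ptes < DANGER_FLOOR:
        return "critical"
    if free_ptes <= HEALTHY_FLOOR:
        return "warning"
    return "ok"


# Samples from the capture: start, low point under load, after logoffs.
for sample in (22_000, 4_000, 145_000):
    print(sample, classify_free_ptes(sample))
```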
Finally, the thread count shows the inverse pattern of Free System PTEs: when we get above 11,000 threads, free PTEs drop. We saw a maximum of 948 processes running at the time of this capture.
Note: Most 32-bit Windows Server 2003 systems will only reach 11,000 – 14,000 threads before PTEs are depleted.
11,962 threads / 948 processes ≈ 12.6 threads per process, which is a normal number (this can vary with different types of applications)
11,962 threads / 38 sessions ≈ 315 threads per user session
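The same ratios can be recomputed from any Perfmon capture. Here is a minimal sketch, assuming you have already read the peak thread, process, and session counts from the counters; the helper name `threads_per` is made up for this example.

```python
def threads_per(total_threads: int, count: int) -> float:
    """Average number of threads per process (or per session)."""
    if count <= 0:
        raise ValueError("count must be positive")
    return total_threads / count


# Peak values from the capture discussed above.
threads, processes, sessions = 11_962, 948, 38

per_process = threads_per(threads, processes)
per_session = threads_per(threads, sessions)
print(f"{per_process:.1f} threads per process")
print(f"{per_session:.1f} threads per session")
```

Tracking the per-session figure over time is a quick way to notice when a new application has made each user noticeably heavier.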
How do we troubleshoot this type of problem? Glad you asked!
First, we need to determine the number of free PTEs at boot. I gave ranges above, but in general we should look for at least 170,000 free PTEs when the machine starts up. If your server is not around this number, then one of the following could be at play:
The Server has greater than 16 GB of memory. I have never seen a 32-bit terminal server that could use more than 16 GB of memory before running out of some type of kernel memory resource. Having more memory takes additional kernel memory for the PFN database, which tracks the state of each page of physical memory. Either physically remove the memory or use /maxmem=16200 in the boot.ini file to limit memory to 16 GB.
The Server is using the /3GB switch in the boot.ini. Using /3GB (4-gigabyte tuning) on a terminal server will not work well, because it shrinks the kernel’s virtual address space from 2 GB to 1 GB. Do not try to run programs that need 3 GB of memory on a terminal server. Move them to another server. While you can increase the number of PTEs by using /USERVA=3030 (or a value down to 2800) in the boot.ini, other kernel memory pools will be limited and the system will not scale.
Complex hardware including multiple NIC’s and HBA’s (reserves addresses at boot). Some NICs and HBAs can reserve a lot of virtual addresses. This will reduce the amount of kernel memory including PTEs. Disable or remove any unneeded hardware from the system.
If the number of free PTEs at boot looks normal but then declines under load, the following should be reviewed:
How many processes is each user running? Light users may only be running 4 while heavy users may be running 20 or more. A few heavy users can make the difference between getting 50+ users on a system and only getting 30 users.
Are there any programs that can be removed? It is not uncommon for new programs to get added while old programs remain installed. Try removing those that are not needed.
Is there an individual program that is creating a lot of threads? Threads per process can be viewed by adding the Threads column in Task Manager. I had one customer that had a custom program that was creating over 350 threads per instance - almost all of them UI threads (350 threads * 16 PTEs = 5,600 PTEs per instance of this program).
Note: On average, about half of the threads will be UI threads, so on a system with 10,000 threads they will use the following:
5,000 threads * 4 PTE’s + 5,000 threads * 16 PTE’s = 100,000 PTE’s used just for thread stacks
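The arithmetic above generalizes to any thread count. The sketch below is a rough estimate only, using the per-thread figures from this article (4 PTEs for a normal thread, 16 for a UI thread); the function name and the `ui_fraction` parameter are assumptions made for this example.

```python
# Per-thread costs as stated in the article.
PTES_PER_NORMAL_THREAD = 4
PTES_PER_UI_THREAD = 16


def thread_stack_ptes(total_threads: int, ui_fraction: float = 0.5) -> int:
    """Estimate the System PTEs consumed by thread stacks.

    ui_fraction is the assumed share of UI threads (about half on average).
    """
    if not 0.0 <= ui_fraction <= 1.0:
        raise ValueError("ui_fraction must be between 0 and 1")
    ui_threads = round(total_threads * ui_fraction)
    normal_threads = total_threads - ui_threads
    return (normal_threads * PTES_PER_NORMAL_THREAD
            + ui_threads * PTES_PER_UI_THREAD)


# The 10,000-thread system from the note, half of them UI threads.
print(thread_stack_ptes(10_000))

# One instance of the 350-thread custom program, treated as all UI threads.
print(thread_stack_ptes(350, ui_fraction=1.0))
```

Comparing this estimate with the free PTE count at boot gives a quick sense of how many threads a server can hold before it reaches the danger area.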
Once you have checked the number of free PTEs available at boot and removed any unneeded processes, you will have optimized the number of Free System PTEs. If free PTEs continue to decrease into the danger area, then the server load will need to be reduced by limiting the number of logged-on users. If you still experience issues, then an x64 OS could be in your future.
In some cases, it is also good to look at the "SystemPages" value under HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management. The default should be 0, but I had one case in the past where someone had set this to a number like 50000 on both nodes of an Itanium cluster. When running under load, the system was not able to create any new threads, perform failover, etc.
Critical information in simple words. Understanding the limits and making optimal use of these resources lets you know what is going in and out of the system. Thanks for the article.