Thoughts from the EPS Windows Server Performance Team
In our last post, we looked at some common memory issues and how to troubleshoot them. Today we're going to go over excessive paging and memory bottlenecks.
We've talked about issues with the page file in several posts - something to bear in mind is that although you want to have enough RAM to prevent excessive paging, the aim should not be to try to prevent paging activity completely. Some page fault behavior is inevitable - for example when a process is first initialized. Modified virtual pages in memory have to be updated on the disk eventually, so there will be some amount of Page Writes /sec. However, when there is not enough RAM installed, there are two issues in particular that you may see - too many page faults, and disk contention.
Let's start with page faults. Page faults are divided into two types: soft and hard. A page fault occurs when a process requests a page in memory and the system cannot find the page at the requested location. If the requested page is actually elsewhere in memory, the fault is a soft page fault. However, if the page has to be retrieved from disk, a hard fault occurs. Most systems can handle soft page faults with no issues, but if there are lots of hard page faults you may experience delays. The additional disk I/O resulting from constantly paging to disk can interfere with applications that are trying to access data stored on the same disk as the page file. Although a high page fault rate is a fairly straightforward issue to describe, confirming it requires some extensive data gathering and analysis in Performance Monitor. The counters below are the important ones when troubleshooting a suspected page fault issue:
- Memory \ Page Faults /sec
- Memory \ Page Reads /sec
- Memory \ Page Writes /sec
- Memory \ Pages /sec
- Memory \ Available Bytes
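The soft/hard distinction can be observed directly. A minimal sketch, assuming a Unix-like system where Python's `resource` module exposes the per-process fault counters (on Windows you would read the equivalent numbers from the Performance Monitor counters above instead):

```python
import resource

def fault_counts():
    """Return (soft_faults, hard_faults) for the current process."""
    ru = resource.getrusage(resource.RUSAGE_SELF)
    return ru.ru_minflt, ru.ru_majflt   # minor = soft, major = hard

soft_before, hard_before = fault_counts()

# Touch ~32 MB of brand-new memory. Each first touch of a page is a
# soft (minor) fault: the zero-filled page is already in RAM, it just
# isn't mapped into the working set yet.
buf = bytearray(32 * 1024 * 1024)

soft_after, hard_after = fault_counts()
print("soft faults incurred:", soft_after - soft_before)
print("hard faults incurred:", hard_after - hard_before)
```

On a machine with plenty of free RAM the hard-fault delta stays at or near zero while the soft-fault delta jumps, which is exactly the "inevitable" fault activity the post describes.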
Remember that since the operating system has to write changed pages to disk, there will always be some page write operations occurring. However, Page Reads /sec, which indicates the number of hard faults, is extremely sensitive to situations with insufficient RAM. As the value of Available Bytes decreases, the number of hard page faults will normally increase. The total number of Pages /sec that can be sustained by the system is a function of the disk bandwidth. This means there is no simple number that tells you whether or not the disks are saturated. Instead, you have to identify how much of the overall disk traffic is being caused by paging activity.
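One way to make that last step concrete is to compare the paging counters against total disk operations. A rough sketch with hypothetical counter samples (the names mirror Performance Monitor counters; the numbers are purely illustrative):

```python
# Hypothetical Performance Monitor samples (illustrative numbers only).
# Paging I/O shows up as Page Reads /sec and Page Writes /sec; comparing
# them against all disk operations estimates the share of disk traffic
# that paging is responsible for.
samples = {
    "Memory \\ Page Reads /sec": 40.0,    # hard faults serviced from disk
    "Memory \\ Page Writes /sec": 10.0,   # modified pages flushed to disk
    "PhysicalDisk \\ Disk Transfers /sec": 200.0,  # all disk operations
}

def paging_share(s):
    """Fraction of disk transfers attributable to paging activity."""
    paging_ops = (s["Memory \\ Page Reads /sec"]
                  + s["Memory \\ Page Writes /sec"])
    return paging_ops / s["PhysicalDisk \\ Disk Transfers /sec"]

print(f"paging accounts for {paging_share(samples):.0%} of disk transfers")
# prints: paging accounts for 25% of disk transfers
```

If that fraction stays high while disk queues build up, the disks are being saturated by paging rather than by the applications' own I/O.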
Another indicator of a memory bottleneck is that the pool of Available Bytes is depleted. A shortage of available bytes triggers page trimming by the Virtual Memory Manager, which attempts to replenish the pool by identifying virtual memory pages that have not been referenced recently. When page trimming is effective, the older pages trimmed from process working sets are not needed again soon. Trimmed pages are marked "in transition" and remain in RAM for a period of time, which reduces the amount of paging to disk. However, if there is a chronic shortage of available bytes, page trimming becomes less effective and the result is more paging to disk: because there is little room in RAM for the pages marked in transition, a recently trimmed page that is referenced again has to be retrieved from disk rather than from RAM. The more severe the bottleneck, the more often the page file is updated - which interferes with application-directed I/O operations on the same disk.
Before we wrap up, let's quickly discuss a guideline for interpreting the Memory \ Available Bytes counter. Normally, if Available Bytes stays above 5% of the installed RAM, you should be in decent shape. However, some applications manage their own working sets - IIS 6, Exchange Server, and SQL Server, for example. These applications interact with the virtual memory manager to grow their working sets when memory is available and trim them when signaled by the operating system. Because they rely on RAM-resident cache buffers to reduce I/O to disk, RAM on these systems will always look full - and that alone is not a sign of a problem.
And on that note, we will wrap up our two-part Overview of Troubleshooting Memory Issues. Until next time ...
- CC Hameed
Picking up on your comments about soft page faults, I wonder if you've seen that memory allocation can cause >100,000 soft faults per second, which can really slow a system down.
The issue occurs when a program repeatedly allocates and deallocates blocks of memory over 1 MB: the heap manager returns this memory to the free list, so Windows has to zero it out in the kernel, page by page, the next time it is used.
This could be considered an application problem rather than a Windows problem, though personally I feel Windows should allow the limit to be adjusted.
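The effect described in the comments above can be reproduced with a rough sketch - here on a Unix-like system via Python's `resource` module rather than the Windows heap, but the demand-zero mechanism is the same, and the block size is chosen large so the allocator hands the memory back to the OS on every free:

```python
import resource

def minor_faults():
    return resource.getrusage(resource.RUSAGE_SELF).ru_minflt

# Large enough that the C allocator returns the block to the OS on free
# (above glibc's mmap-threshold cap), mirroring the >1 MB Windows case.
SIZE = 64 * 1024 * 1024

# Pattern 1: allocate and free a large block every iteration. Each
# allocation comes back as fresh demand-zero pages, so every first
# touch soft-faults, over and over.
before = minor_faults()
for _ in range(10):
    buf = bytearray(SIZE)   # allocated, zero-filled (touches pages), freed
    del buf
churn_faults = minor_faults() - before

# Pattern 2: allocate once and reuse the same block.
before = minor_faults()
buf = bytearray(SIZE)
for _ in range(10):
    buf[::4096] = b"\x01" * (SIZE // 4096)   # touch every page, reuse buffer
reuse_faults = minor_faults() - before

print(churn_faults, reuse_faults)   # churn incurs far more soft faults
```

Pooling or reusing large buffers in the application, instead of freeing them back to the OS each cycle, is the usual workaround - which matches the observation that this is as much an application pattern as an OS behavior.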
I am seeing the same thing with soft page faults. I am glad to see someone else has noticed this problem. It will consume about 25-50% of the CPU time (seen as kernel time in Task Manager). This is definitely a problem with the OS as far as I am concerned.
I will get up to 200,000 soft page faults per second in a test case I put together today, just because I am malloc/free'ing memory for intermediate results. The other way one can see this occurring is by noticing the constantly adjusting working set size displayed in Task Manager while the maximum memory sizes never change. The OS is doing a lot of work and consuming a lot of resources in order to get in my way.
We are trying to use a Windows system for some performance computations, and this is keeping us from reaching the system's full potential.
Do you know if there are any settings to make the OS a little lazier on my behalf?
I am confused about how RAM, cache, and physical memory are related to each other and how they work. Please help me!