Thoughts from the EPS Windows Server Performance Team
Back In October, we wrote a post regarding the Event ID 333 Errors and how to troubleshoot them. At the end of that post, we mentioned the "Lock Pages in Memory" user right. One of the things we mentioned was that if you have an x64 system, you should not enable this right. Over the last few months, we have seen a number of cases where x64 SQL servers have been exhibiting performance issues - either within the Operating System or within SQL. It's not always the Event ID 333 messages that we've been seeing either, so today we're going to give you the lowdown on Lock Pages in Memory - and whether or not you really need it. So, let's get started ...
The scenarios listed below are some typical scenarios that we work on with customers on a daily basis. For each of these scenarios, we are going to use a system with 8 processors, 32GB of RAM installed, and a 64-bit OS with 64-bit SQL installed:
Before we start digging into the troubleshooting, let's quickly review some important points regarding these scenarios. In the first two instances, the most likely cause of the problem is incorrect scaling or configuration. Moving on, the Event ID 333 indicates that we are unable to flush the registry, or perform read / write operations to the registry. As we mentioned in our Event ID 333 post, there are several possible causes. One common cause is memory pressure - you may also see Event ID 2019's or 2020's indicating a resource depletion or you may have a condition with low PTE's. However, this is not always the case, and we'll get into this a bit later on in the post. Something interesting to note is that when you are in a situation where you are experiencing Event ID 333, a memory dump may not be of any use - because we can't see what caused the Event 333. The only way to clear the event and restore operations is to reboot the server.
OK, so why do we start running into these issues? SQL Server listens to Physical Memory Notifications to trim its working set. In these scenarios, the trimming is done by the working set manager which has its own set of rules for how and when to trim memory. On Windows 2000 and Windows Server 2003, if the currently available resources are insufficient to satisfy a new request for resources made by a driver or another application, the working set manager will trim the working set size of all applications to satisfy the request. Small but repeated working set trimming will eventually result in significant portion of SQL Server being paged out. To compensate for this SQL queries its own working set on a regular basis and triggers an internal indication that physical memory is low when a significant portion of its memory allocations have been trimmed. The Catch-22 here is that although SQL can decommit memory, the pages that are decommitted may not be the same ones that were paged out.
However, if the request for resources is a request for contiguous memory, then the working set manager uses aggressive trimming that results in all applications being forced to empty their working sets. This is a key difference to bear in mind based on what type of request is made. In Windows Server 2008, however we efficiently (and gracefully) swap out the memory instead of emptying all the working sets - which results in better overall and consistent performance.
Switching our attention to SQL server specifically, there are a couple of memory settings within SQL that can be configured - Max and Min Server Memory. When configuring the Max Memory setting (and Min Memory setting) for SQL Server, it is important to realize that these settings only affects the SQL Buffer Pool. They do not control the overall amount of physical memory consumed by SQL server. The buffer pool is used to cache data pages from the disk and to satisfy the memory requirements for the query processor and for other internal data structures. The size of a SQL Server database page is 8KB. Memory is committed and decommitted in 8KB chunks. So, each time a query is made, or data mining is performed that memory allocation occurs in 8KB chunks. However, the problem is that these chunks are not allocated in a contiguous fashion - the image below illustrates how this might look:
SQL Server itself may require anywhere from approximately 3MB to 650MB of Physical Memory. We're talking about the actual SQL Server process itself here (SQLSERVR.EXE). The actual number required is a function of the size and number of databases - in other words, the amount of data. One thing to note here is that if you were to look at this in a Performance Monitor, both the Buffer Pool and the SQL Server process itself are all displayed under the SQLSERVR.EXE process.
Now that we've set the stage we're ready to start digging into the scenarios that we outlined above. Remember that our sample system has 32GB of RAM installed, 8 processors, and is running a 64-bit OS with 64-bit SQL installed. So ...
Scenario 1: SQL Maximum Memory is set to 30GB. The symptoms are that SQL Working Set is getting trimmed during large file copies and also during the backup process. When looking at this problem, a couple of things spring to mind. From a simplistic view, SQL Server has a maximum memory setting of 30GB, leaving 2GB for the Operating system. On the surface, this seems reasonable, right? Well, the problem is that there is a lot more to consider than just the Operating System. If you recall our post in May 2007 regarding SQL and the Working Set, we broke down some of the math behind the calculations:
Uh-oh. We haven't allocated a single Megabyte of RAM for the Operating System and we're already at 32GB. If you consider that the absolute minimum required just to install the Operating System is 128MB, then it's pretty clear that we already have a resource problem. So given that we already have a resource crunch, it's pretty easy to see what is going to happen. For small requests, SQL will get paged out in small chunks - however, when we perform a large file copy, there clearly are not going to be sufficient resources to satisfy this request, and the aggressive working set trimming takes place. Now let's move on to ...
Scenario 2: SQL Maximum Memory is set to 28GB. SQL is getting paged out during copy operations and when compression software is running its processes. So we have backed off our Maximum Memory from 30GB to 28GB. We should be fine now, right? Unfortunately, now we are seeing SQL getting paged out. Let's go back to our math:
Well, the problem with this series of calculations is that it still omits a few key items - like all of the other applications that are running on the system, including some pretty standard applications such as:
And of course, since this is a SQL Server, there are SQL specific tools to consider, including:
The same basic tenets apply as in the first scenario, we just have a little bit more room to maneuver in terms of memory allocation. However, the bottom line is still that we do have a resource crunch based on the current configuration. And with that, it's on to ...
Scenario 3: SQL Maximum Memory is set to 30GB. This is on a SQL Enterprise Edition, and the "Lock Pages in Memory" right is configured. The Event Logs are filling up with Event ID 333. Given what we've talked about in Scenario 1, we already know that we have a resource crunch. Here's where the "Lock Pages in Memory" feature can get us into trouble. "Lock Pages in Memory" is an Operating System feature that allows any application that can access the API to take its buffer pool and lock it into physical memory thus preventing it from being paged out. This results in quick access in the buffer pool region. However, SQL does this by default – unless the Operating System requires more memory than what is available without having to lean on the buffer pool. If the "Lock Pages in Memory" right is not configured, then the SQL Buffer Pool is paged out or trimmed, depending on the type of request. If "Lock Pages in Memory" is configured however, the buffer pool is protected and the working set manager is unable to free up resources for other operations - resulting in the Event ID 333 messages being generated, along with a significant increase in paging for everything other than the SQL buffer pool.
So - how do we address these issues. If you recall from our SQL and the Working Set post, we talked about some baseline numbers to start with and then gathering a baseline Performance Monitor to do the fine-tuning. For this example, since we don't really know how much memory the "other" applications will need, we're going to build in a little cushion for ourselves while we do our baselining. Remember that you should remove the "Lock Pages in Memory" right when baselining the server so that you get an accurate picture of what is going on.
However, this does not really leave us much of a cushion if the applications require more than 3GB. So we're going to change up the model slightly.
Once we have our baseline data (see below for the specifics), we can reconfigure the Max Memory Setting to give more memory to SQL.
Now that the baseline configuration is set up, it's time to gather the Performance Monitor data. The two primary counters to look at are:
Performance object: Process Counter: Private Bytes Instance: sqlservr
Performance object: Process Counter: Working Set Instance: sqlservr
The Private Bytes counter measures the memory that is currently committed. The Working Set counter measures the physical memory that is currently occupied by the process. 64-bit editions of SQL Server 2005 also use the following performance counter to expose the memory that the buffer pool allocates:
If the instance of the 64-bit edition of SQL Server 2005 is a named instance, the name of the performance object is MSSQL$InstanceName: Memory Manager. If the value of the Working Set counter is less than the value of the Total Server Memory(KB) counter, at least some memory that is a part of the buffer pool has been trimmed from the SQL Server working set. After you assign the Lock pages in memory user right and restart the SQL Server service, the buffer pool of the SQL Server process still responds to memory resource notification events and dynamically grows or reduces in response to these events. However, you cannot see memory allocations for the buffer pool that are locked in memory in the following performance counters:
The Private Bytes counter and the Working Set counter in Performance Monitor.
The Memory Usage column on the Processes tab in Task Manager.
After these pages are locked, these performance counters represent the memory allocations inside the SQL Server 2005 process when those allocations do not use the buffer pool. The Total Server Memory(KB) counter of the SQL Server:Memory Manager performance object accurately represents the memory that is allocated for the buffer pool.
As we mentioned before, Windows Server 2008 improves the contiguous memory allocation mechanism. The improvement lets Windows Server 2008 reduce the side effects of paging out the working set of applications when new memory requests arrive. One thing to note though is that you must install SQL Server 2005 Service Pack 2 (SP2) or a later SQL Server 2005 service pack if you are running on a Windows Server 2008 machine.
So with all of the information above, the original question still stands: "Lock Pages in Memory" .. do you really need it? Once you have configured your server based on the results of the Performance Monitor analysis, that decision really rests with you.
- Dane Smart
I have 32GB Installed.
I have 64-bit OS. (Windows Server 2003R2)
I have 64-bit SQL Server 2005 9.0.3042
I have minimum server memory 2048
I have maximum server memory 28669
My sql server will only use 650MB of memory.
I have no idea why.
Another excellent post, thank you. I do wish you guys would get together with the SQL guys though as I figure most DBA's probably won't view your blog, I shall promote your blog again of course.
(Answer to jerry lagro)
The 650MB you see sql server using is presumably from its private bytes.
The larger dedicated /lock pages in memory usage is not reflected in the private bytes counter
(nor in the working set counter)
Is there any way sysinternals "Process Explorer" can help out in figuring out memory consumption on this issue ? (or another sysinternals tool)
Run Process Explorer the current one will work for x86 and x64.
Go to View, Select Columns, then choose the Process memory tab. Now select Page Faults, Private Bytes, Virtual Size.
You may also want to go back and choose the .Net tab and select "Contentions" (located at the bottom)
This tool has really "enlightened" me on the fact that my SQL environments really need to be tuned better.
We have a 32-bit system, where for a variety of reasons we are removing AWE but are enable to disable lock pages in memory as remove appears to be greyed-out in the Group Policy.
Sorry I meant unable to disable
Are you advocating that we never grant LPIM permissions on x64 machines or only that we remove it if we run into Event ID 333 errors? All of your sample scenarios seem to depend on the user not allocating enough memory for the OS and non-buffer pooled SQL processes.
For example, here is the profile of one of my production servers:
Total memory: 128 GB
Max Memory setting: 120 GB
LPIM set: yes
No SSIS, no SSAS, no Full-text, no CLR, no non-essential applicaitons or services running.
We do not have issues with paging or Event ID 333 errors. Would you still recommend that we remove the LPIM permissions?
For reference, I am trying to defiitively determine whether or not our internal best practice documentation should suggest using or not using LPIM.
You make the statement "One of the things we mentioned was that if you have an x64 system, you should not enable this right.". However in many places (see below) it mentions that if SQL Sever is granted this right on 64 bit systems then SQL Server will use AWE API calls which have been shown to be more efficient for memory allocation/deallocation than the standard API memory allocation calls. So it seems that enabling this right allows more effficient calls and speeds up SQL Server and so long as you dont over-commit the memory as you explain above then there seems to be an advantage all around. The references are:
Page 436 of SQL Server 2008 Administration
Page 151 of SQL Server 2005 Performance Tuning
Page 36 of SQL Server 2008 Internals