Thoughts from the EPS Windows Server Performance Team
In previous posts we've discussed the basics of memory management including an overview of kernel and user memory, pool resources as well as the /3GB switch. Continuing our discussion of memory management, we are going to examine an issue that we have been seeing more of on the Performance team - the Event ID 333 message. As system memory increases, and applications and roles of the server become more important, access to the disk becomes more competitive between applications and the operating system. As a result, you can start to see the Event ID 333 logged, and you are likely seeing other performance issues including server hangs, pauses, sluggish response times and more. So today we're going to discuss what the Event ID 333 is, possible causes and common solutions for both 32-bit and 64-bit systems.
The first thing to understand is what exactly an Event ID 333 is. Event ID 333 is an error written to the System event log when the registry is unable to complete a flush operation to disk. There are several reasons the flush can fail, and we'll discuss them below. So, what does the Event ID 333 look like? In the system event log, you see at least one, and more likely multiple instances of the following:
Event ID 333 is new in Windows Server 2003 Service Pack 1 and is written to the system event log if the operating system is not able to flush out or write to a registry hive. The symptoms that accompany an Event ID 333 vary and include server hangs, “Insufficient resources exist to complete the requested service” errors, SQL databases stopping and starting, slow database queries and sluggish operating system performance. The apparent slow server performance is why the Performance team is engaged on the majority of these issues. As mentioned previously, the Event ID 333 can occur on both x86 and x64 systems.
Now that we have a little background on what the Event ID 333 is and why it exists, we’re going to take a look at the causes. There are a number of different reasons why a server logs Event ID 333's - the majority of the issues are caused by one of the following:
So - how do we go about resolving the Event ID 333 problem? Depending on the particular scenario that is causing your issue, you should perform one of the following:
Maximize Kernel Memory and System PTE's:
A filter driver is preventing the registry from being flushed:
A quick note on filter drivers - removing or disabling filter drivers without understanding the impact to the system can result in unexpected system behavior, including system hangs or bugchecks. Examples of common programs that use filter and kernel drivers include anti-virus software, backup software (including Microsoft Volume Shadow Copy) and Version-2 printer drivers. For information on how to temporarily disable filter drivers, refer to Microsoft KB Article 816071. Before disabling the filter drivers, however, check whether some of the 3rd party drivers are simply outdated. The easiest way to check on these 3rd party filter drivers is to use our MPS Reporting utility to capture this information.
"Lock Pages in Memory" user right:
On x64 systems, the likelihood of a server running out of kernel memory or System PTE's is far less than on an x86 system. However, we have seen the Event 333 error occur on x64 systems as well. Using the "!vm" debugger command when reviewing a memory dump of the server may indicate that the server is low on physical memory, even though performance monitor data indicates that there is plenty of available memory. The sample output below illustrates this:
15: kd> !vm
*** Virtual Memory Usage ***
Physical Memory: 2095394 ( 8381576 Kb)
Page File: \??\C:\pagefile.sys
Current: 4249600 Kb Free Space: 4154172 Kb
Minimum: 4249600 Kb Maximum: 4249600 Kb
Available Pages: 868200 ( 3472800 Kb)
ResAvail Pages: 250 ( 1000 Kb)
********** Running out of physical memory **********
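To make the mismatch in that output concrete, here is a minimal Python sketch that pulls the three page counters out of pasted `!vm` text and compares them. The function names and the idea of reporting "available as a percentage of physical memory" are illustrative assumptions, not part of the debugger or of any Microsoft tool; the only data used is the sample output quoted above.

```python
import re

# Sample text copied from the !vm output quoted in the post.
SAMPLE_VM_OUTPUT = r"""15: kd> !vm
*** Virtual Memory Usage ***
Physical Memory: 2095394 ( 8381576 Kb)
Page File: \??\C:\pagefile.sys
Current: 4249600 Kb Free Space: 4154172 Kb
Minimum: 4249600 Kb Maximum: 4249600 Kb
Available Pages: 868200 ( 3472800 Kb)
ResAvail Pages: 250 ( 1000 Kb)
"""

# Matches entries such as "ResAvail Pages: 250 ( 1000 Kb)".
_COUNTER = re.compile(
    r"(Physical Memory|Available Pages|ResAvail Pages):\s*(\d+)\s*\(\s*(\d+)\s*Kb\)"
)


def parse_vm_counters(text):
    """Return {counter name: (pages, kilobytes)} for the three page counters."""
    return {name: (int(pages), int(kb)) for name, pages, kb in _COUNTER.findall(text)}


def summarize(text):
    """Hypothetical helper: contrast available memory with resident-available memory."""
    counters = parse_vm_counters(text)
    total_kb = counters["Physical Memory"][1]
    avail_kb = counters["Available Pages"][1]
    return {
        # Share of physical memory that looks free to Performance Monitor.
        "available_pct": round(100 * avail_kb / total_kb, 1),
        # What the memory manager can still commit as resident pages.
        "resavail_kb": counters["ResAvail Pages"][1],
    }


if __name__ == "__main__":
    print(summarize(SAMPLE_VM_OUTPUT))
```

In the sample, roughly 41% of physical memory is reported as available while only 1000 Kb of resident-available pages remain, which is the situation the debugger flags with its "Running out of physical memory" banner.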
To work around this behavior on x64 platforms or on servers with 4GB or less of physical RAM, use the following steps:
NOTE: By default, the Windows operating system does not grant this user right to any accounts. The “Lock pages in memory” right is granted to the account used for SQL Services by the SQL 2005 RTM/SP1 Enterprise Edition install on 32-bit systems. If you are using SQL Enterprise on 32-bit servers with more than 4GB of RAM, the Lock Pages in Memory right is needed. To help reduce the occurrence of the Event ID 333's on these systems, ensure the user account you are using for the SQL services is only used for SQL. Check the access right for “Lock pages in memory” and only list the account used for SQL. If System, or any other accounts are listed, remove them. For x64 systems, remove all accounts listed.
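The rule in the note above can be written down as a small decision function. This is only a sketch of the recommendation as stated here: the function name, its parameters and the example account names are hypothetical, and it does not read or change any security policy. You would still review the right (SeLockMemoryPrivilege) by hand in the Local Security Policy console.

```python
def accounts_to_remove(holders, is_x64, sql_service_account=None):
    """Return the accounts that should lose the "Lock pages in memory" right.

    holders             -- accounts currently listed for the right
    is_x64              -- True for x64 systems (the note says: remove all)
    sql_service_account -- dedicated SQL service account to keep on 32-bit
    """
    if is_x64:
        # Per the note: on x64 systems, remove every listed account.
        return sorted(holders)
    # On 32-bit SQL Enterprise servers, keep only the SQL service account.
    # Windows account names are compared case-insensitively.
    keep = (sql_service_account or "").lower()
    return sorted(a for a in holders if a.lower() != keep)


if __name__ == "__main__":
    listed = ["SYSTEM", r"CONTOSO\sqlsvc"]  # hypothetical example accounts
    print(accounts_to_remove(listed, is_x64=False, sql_service_account=r"CONTOSO\sqlsvc"))
    print(accounts_to_remove(listed, is_x64=True))
```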
Well, that does it for this post. As you can see, the Event ID 333 has several causes and is usually one of many symptoms seen on servers experiencing performance problems. By troubleshooting and eliminating Event ID 333’s, the overall performance of the server should improve. Keep an eye on the Microsoft Knowledge Base - in the coming weeks, the majority of this information should be available in a KB Article.
- Aurthur Anderson
I've seen Event ID 333 with a filter driver that keeps writing a 64KB binary blob to the registry at every transaction - it just replaces the whole blob because the registry API doesn't allow flushing / writing part of the blob. Probably a poor design in the filter driver, but it doesn't happen on all systems. What's more interesting is that the system logs Event ID 333 every time the driver writes this blob, but the registry API doesn't fail - it succeeds and the driver continues with further writes.
Do you think something like this would happen because of the disk queue length? Are there any other parameters like the amount of data being written, the frequency of writes to the registry, any workarounds?
Thanks for the information. What could be happening is that the driver's write coincides with the OS attempt to flush the registry to disk. Another workaround for the symptoms, which I forgot to include in the above post, is changing RegistryLazyFlushInterval to a higher value, which increases the time between flushes and gives other disk I/O more access to the disk:
--> Configure RegistryLazyFlushInterval in the registry to 60 seconds. (KB 317357)
Windows Registry Editor Version 5.00
[HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\Session Manager\Configuration Manager]
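The fragment above names the key but not the value line itself. As a rough sketch, the Python below renders a complete .reg file for a chosen interval. The key path and value name come from this reply; treating the value as a DWORD measured in seconds is an assumption based on the "60 seconds" wording, so check KB 317357 before importing anything. The function name is hypothetical.

```python
KEY = (
    r"HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control"
    r"\Session Manager\Configuration Manager"
)
VALUE_NAME = "RegistryLazyFlushInterval"


def render_reg_file(interval_seconds):
    """Return .reg text that sets the lazy flush interval (assumed DWORD, seconds)."""
    if not 0 < interval_seconds <= 0xFFFFFFFF:
        raise ValueError("interval must fit in a positive 32-bit DWORD")
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{KEY}]",
        # .reg files store DWORD data as eight lowercase hex digits.
        f'"{VALUE_NAME}"=dword:{interval_seconds:08x}',
        "",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_reg_file(60))
```

For 60 seconds the value line comes out as `"RegistryLazyFlushInterval"=dword:0000003c`. Generating the text rather than writing to the registry directly keeps the change reviewable before it is imported on a server.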
Thanks for the feedback and let me know if you have any additional questions.
Let's assume that a driver is leaking memory. Beyond "make sure your drivers are updated" what specific functions would be used with Event Tracing for Windows to identify such a driver?
With respect to "For x64 systems, remove all accounts listed" (above), I see that http://support.microsoft.com/kb/918483/en-us says "To prevent SQL Server 2005 64-bit buffer pool memory from being paged out of physical memory, you can enable the lock pages in memory permissions."
Are you saying that it would be "best" to allow SQL Server's buffer pool to be paged memory? Isn't the salient purpose of SQL Server's buffer pool to reduce I/O demands?
Wouldn't allowing SQL Server's buffer pool to be paged out of memory (by removing “Lock pages in memory” for sqlservr.exe) thus be detrimental to SQL Server's performance?
If some other driver is leaking, wouldn't it be best (from the perspective of SQL Server's performance) to identify that leaking driver, first?
I've seen Event 333, and at the same time I received the following message on the console:
Windows - Low On Registry Space
The system has reached the maximum size allowed for the system part of the registry. Additional storage requests will be ignored.
The size of the System hive is about 5MB.
The size of the Software hive is about 100MB.
The hive that doesn't flush is Software.
The OS is Windows Server 2003 SP2.
I checked disk performance, memory usage, the installed filter drivers and the "Lock Pages in Memory" user right, and I didn't find anything wrong.
I had the same issue and none of these worked for me. Simply rebooting the server got rid of the issue.
Although the errors were no longer appearing in the event log and I was able to back up the system using NTBACKUP, I found that every time the Legato backup started it generated the same error in the event log and then the backup would fail. I took the following steps and they fixed the issue for good:
Uninstall legato networker from the server and reboot
Reinstall it after the reboot is complete
The Legato client (Legato Networker) had been uninstalled and reinstalled prior to this, but no reboot was done, and that made all the difference.
Hope this helps you guys resolve your issue.
I've got this problem on a 64bit server running Exchange 2007 SP1. Rebooting keeps it at bay for a while then it comes back again. There is nothing on this server other than Exchange and a Symantec BackupExec 11d Remote Agent. Have no idea where to look next as none of the above helps...
One of my Windows 2003 SP2 servers is having this Event ID 333 problem. The server may hang and has to be rebooted to clear the problem.
It's a Dell PE2800. All Dell drivers, firmware and software are updated.
I'm not sure what the Dell RegFix F6 does.
I have Backup Exec 10d, McAfee VSE 8.0i patch 15.
We had the same error on our servers (Dell 6850 and 2850), which had 4x1GB and 2x2GB memory (added separately).
Removing the 2x2GB memory solved the problem.
I have the exact same issue as Alex. Once the server reboots the issue comes back, and either the connection to AD causes all OWA access to cease or, in the most recent event, all outbound email stops working. Unfortunately I don't have a baseline to compare to, so that won't work for me, but what specific counters should I be looking at for memory/hard disk issues?
We have just encountered this error after a Symantec set of updates went through. We're running version 10.1.5.5000. A product update rolled through at midnight on April 25/26 and we have received this message ever since.
I see this message across ALL the Siebel servers I support. Usually after a reboot the system works fine. Then it deteriorates. When the problem surfaces, even the ping command gets the message "transmit failed error code 1450". Not to mention that VNC and remote desktop stop working, and a valid address in the IE browser, e.g., http://good-ip gets the message "Page cannot be displayed" rather than "page under construction".
We have patch KB913446 installed (instead of KB898060), but no improvement can be seen. Actually, right after reboot, I already see the popup error in the env logs.
Would we expect to see Event ID 333 errors because of running out of Virtual Mem space?
I am seeing the same issue as Troy and Alex on my Exchange 2007 SP1 server. I suspect it's related to the page file size and virtual memory being insufficient. The server has 12GB of physical memory but only 2-4GB set for the page file. I'm going to add an 18GB page file to another partition (G:), remove the page file from C: and give it a reboot this evening. If it works, I'll let you know.
I can be reached at email@example.com
I'm seeing this problem on our backup server (using Symantec Backup Exec 11d) since we migrated from a WFW/Novell to an AD based setup. I noticed that no account has been given the "Lock pages in memory" right. As Backup Exec uses SQL, should the SQL account be in there or not?