Kevin Holman's System Center Blog

Posts in this blog are provided "AS IS" with no warranties, and confer no rights. Use of included script samples is subject to the terms specified in the Terms of Use.

WMI leaks memory on Server 2008 R2 monitored agents


 

Here is something that a customer brought to my attention, and is probably impacting you already.

 

They noticed that WMI on some of their Server 2008R2 monitored agents was consuming a large amount of memory – and continually increasing.  I started tracking this in SCOM by writing a rule to collect the Process\Private Bytes of all WMI processes (WmiPrvSE*) to check.

Sure enough – a handful (but, strangely, not all) of my Windows 2008 R2 monitored servers are exhibiting this behavior.  Below is a graph where you can see most processes are consuming ~20MB or less, but some are steadily increasing – consuming 400MB of RAM or more.

 

[Graph: Process\Private Bytes of the WmiPrvSE processes on monitored servers]
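
To pick the leaking processes out of collected Process\Private Bytes data without eyeballing a graph, something like the following sketch might help. It assumes the samples have already been exported from SCOM into plain lists of byte counts per process instance. The export step, the instance names, and both thresholds are hypothetical and are not part of the original rule.

```python
from statistics import mean

MB = 1024 * 1024


def growth_per_sample(samples):
    """Least-squares slope of the samples against their index (bytes per sample)."""
    n = len(samples)
    if n < 2:
        return 0.0
    x_mean = (n - 1) / 2
    y_mean = mean(samples)
    num = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(samples))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


def find_leaking(series, min_growth_mb=1.0, min_final_mb=100.0):
    """Return instance names whose Private Bytes trend upward and end above a floor.

    series maps an instance name to its Private Bytes samples, oldest first.
    Both thresholds are illustrative assumptions and should be tuned.
    """
    suspects = []
    for name, samples in series.items():
        if not samples:
            continue
        slope_mb = growth_per_sample(samples) / MB
        final_mb = samples[-1] / MB
        if slope_mb >= min_growth_mb and final_mb >= min_final_mb:
            suspects.append(name)
    return sorted(suspects)


# Hypothetical data: one steadily growing process and one healthy process.
series = {
    "serverA\\WmiPrvSE": [20 * MB, 120 * MB, 250 * MB, 400 * MB],
    "serverB\\WmiPrvSE": [18 * MB, 20 * MB, 19 * MB, 20 * MB],
}
print(find_leaking(series))
```

A slope test is used rather than a simple "current value above X" check because a healthy process can briefly spike, while the leak described here shows up as sustained growth.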

 

If the leak runs long enough, you might occasionally also see this in your event logs:

Log Name:      Application
Source:        Application Error
Date:          3/10/2010 4:24:35 PM
Event ID:      1000
Task Category: (100)
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      VS5.opsmgr.net
Description:
Faulting application name: wmiprvse.exe, version: 6.1.7600.16385, time stamp: 0x4a5bc794
Faulting module name: ole32.dll, version: 6.1.7600.16385, time stamp: 0x4a5be01a
Exception code: 0xc0000005
Fault offset: 0x0000000000039389
Faulting process id: 0x180
Faulting application start time: 0x01cabfafa91cc252
Faulting application path: C:\Windows\system32\wbem\wmiprvse.exe
Faulting module path: C:\Windows\system32\ole32.dll
Report Id: b45b5a1d-2c93-11df-ac21-001b213a78be

 

It turns out there is a hotfix for Windows 2008 R2 – which addresses a possible leak when an application queries the Win32_Service class frequently.  A monitoring tool would do this – and therefore OpsMgr can accelerate this leak in the OS.

http://support.microsoft.com/kb/981314

 

This hotfix addresses this issue – I applied it to my servers – and they are no longer leaking memory from the WMI process.

 

[Image: WMI process memory after applying the hotfix]

 

I am adding this hotfix to my recommended hotfixes link, in the OS section.

 

http://blogs.technet.com/b/kevinholman/archive/2009/01/27/which-hotfixes-should-i-apply.aspx

 

 

These are some signs that this might be impacting you in OpsMgr:

 

You might get some alerts in the console like the following:

Workflow Runtime: Failed to run a process or script

The process started at 1:22:12 AM failed to create System.Discovery.Data. Errors found in output:

C:\Program Files\System Center Operations Manager 2007\Health Service State\Monitoring Host Temporary Files 5\4235\AlertUpdateConnectorDiscovery.vbs(16, 1) SWbemObjectSet: No more threads can be created in the system.

Command executed: "C:\Windows\system32\cscript.exe" /nologo "AlertUpdateConnectorDiscovery.vbs" {A7504CAE-3EA5-5B1F-CDA4-A4593E4D85FD} {F8AEF188-D663-9719-3FD8-94B2AF6F0726} SQL2V1.opsmgr.net
Working Directory: C:\Program Files\System Center Operations Manager 2007\Health Service State\Monitoring Host Temporary Files 5\4235\

One or more workflows were affected by this.

Workflow name: AlertUpdateConnector.ConnectorDiscovery
Instance name: SQL2V1.opsmgr.net
Instance ID: {F8AEF188-D663-9719-3FD8-94B2AF6F0726}
Management group: PROD1

Or:

Workflow Runtime: Failed to run a WMI query

Object enumeration failed

Query: 'SELECT DisplayName, Name, StartMode FROM Win32_Service WHERE Name="ClusSvc" and StartMode!="Disabled"'
HRESULT: 0x80041006
Details: Out of memory

One or more workflows were affected by this.

Workflow name: Microsoft.Windows.Cluster.Service.Discovery
Instance name: SQL2CLN1.opsmgr.net
Instance ID: {90476733-8FA9-1718-152C-932FF9AB9BC6}
Management group: PROD1

Or:

Workflow Runtime: Failed to run a WMI query

Object enumeration failed


Query: 'SELECT NumberOfProcessors FROM Win32_ComputerSystem WHERE DomainRole >1'
HRESULT: 0x800705af
Details: The paging file is too small for this operation to complete.

One or more workflows were affected by this.

Workflow name: System.Mom.BackwardCompatibility.Computer.Server.DiscoveryRule
Instance name: SQLDB1.opsmgr.net
Instance ID: {AF7C2749-FF52-E354-EEAE-8CFCA3541607}
Management group: PROD1


The details of the script, discovery, or workflow are irrelevant.  What is relevant here is seeing the messages “No more threads can be created in the system”, “Out of memory”, and “The paging file is too small for this operation to complete”.

Those are tell-tale signs of a memory leak or memory pressure, and in this case caused by WMI.
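
If you export alert descriptions as text, a rough way to triage them for these three messages could look like the sketch below. The function name and the idea of scanning exported text are assumptions for illustration only. The three strings are taken verbatim from the alerts above.

```python
# The three messages called out above as signs of a leak or memory pressure.
MEMORY_PRESSURE_SIGNS = (
    "No more threads can be created in the system",
    "Out of memory",
    "The paging file is too small for this operation to complete",
)


def memory_pressure_signs(alert_text):
    """Return the tell-tale messages found in an alert description.

    Matching is case-insensitive; an empty list means none were found.
    """
    lowered = alert_text.lower()
    return [sign for sign in MEMORY_PRESSURE_SIGNS if sign.lower() in lowered]


# Example using the text of one of the alerts shown above.
alert = (
    "Object enumeration failed\n"
    "HRESULT: 0x80041006\n"
    "Details: Out of memory\n"
)
print(memory_pressure_signs(alert))
```

A match only says the alert is worth a closer look. It does not prove that WMI is the process consuming the memory, so confirm with the Private Bytes counter for WmiPrvSE.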

 

Sure enough – when I check this system, I can easily see there is an issue:

 

[Image: the affected system]

 

If you are running Server 2008R2 on ANY monitored system, it is highly likely that you need to apply this hotfix. 

I recommend it across the board for all Windows 2008 R2 monitored agents, until Windows Server 2008R2 SP1 releases, or something supersedes this.
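
As a rough aid for deciding which agents to target, the rule of thumb above can be written as a small check. This is a sketch under stated assumptions: the OS version string and the list of installed hotfix IDs are inputs you gather yourself (how you collect them is not shown), and the function name is hypothetical. It relies on Windows Server 2008 R2 RTM reporting version 6.1.7600 and SP1 reporting 6.1.7601.

```python
HOTFIX_ID = "KB981314"


def hotfix_recommended(os_version, installed_kbs):
    """Return True when the hotfix looks applicable and is not yet installed.

    os_version: a string such as "6.1.7600" (major.minor.build).
    installed_kbs: iterable of hotfix IDs such as "KB981314".
    """
    parts = os_version.split(".")
    if len(parts) < 3:
        raise ValueError("expected a version like '6.1.7600'")
    major, minor, build = (int(p) for p in parts[:3])
    if (major, minor) != (6, 1):
        return False  # not the Windows 7 / Server 2008 R2 code base
    if build >= 7601:
        return False  # SP1 or later is treated as not needing the hotfix
    return HOTFIX_ID not in {kb.upper() for kb in installed_kbs}


print(hotfix_recommended("6.1.7600", []))            # RTM, hotfix missing
print(hotfix_recommended("6.1.7600", ["KB981314"]))  # RTM, hotfix present
print(hotfix_recommended("6.0.6002", []))            # Server 2008 non-R2
```

Version 6.1 is shared with Windows 7, so filter to monitored servers before acting on the result.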

Comments
  • Hello Kevin & Wilson,

    We are seeing a similar issue with our ConfigMgr Central & Primary Site servers that run on Server 2008 with SP1 and NO R2.  We have had CM on these boxes for a while, however, we started to get these WMI errors shortly after installing OpsMgr agents on these boxes (as far as I can tell.  I could be wrong but no other significant change has been made to these boxes).  We get errors similar to the one listed below, and when we try to connect to the CM Site database through the admin console, it fails.  Basically, the only way we get CM to work again is by rebooting.  I have put one of the servers where we are seeing this issue into Maintenance mode in OpsMgr, to see if that would help...  Any suggestions would be greatly appreciated!  Manoj

    Log Name:      Application
    Source:        Perfstat_for_Windows
    Date:          10/1/2010 2:40:14 PM
    Event ID:      7015
    Task Category: None
    Level:         Error
    Keywords:      Classic
    User:          N/A
    Computer:      SITE_SERVER_NAME
    Description:
    Exception - error during XML_WMICAT. :
    Invalid query
    memfree,Win32_PerfRawData_PerfOS_Memory,free,AvailableKBytes

  • Hi Kevin. Thanks for the good read regarding these WMI issues. I'm with some of those who have commented - I see these issues on a non-R2 server. I also read somewhere that some have applied this fix to non-R2 installations and it has stopped issues related to leaks caused by WMI. We are running an HP ML350 G6 with SBS08. A bunch of other software has started intermittently failing due to lack of resources, and only a reboot can get things back to normal. This usually only lasts a week before needing another reboot.

    If anyone can help point me in the right direction I'd be most grateful. Thanks for the time!

  • This fix CANNOT be applied to a non-R2 server.  It won't work, won't install...  This is for a VERY specific issue that presents itself only on Win7/2008R2.

    There are other 2008 WMI hotfixes - but you need to do much more research as to why your system becomes unstable.

  • Hi, we have this problem on some of our 2008 R2 Domain Controllers.

    The thing is that if we apply the hotfix, the DC will hang in about 2 days with all sorts of alarms and errors. It all ends with you being unable to log on to the server, and the only way out is to reboot it. When we roll back the hotfix, everything goes back to normal, that is to say a stable DC but with WMI processes that leak memory.

    So this is just a heads up for all of you.

  • @ Patrik -

    I have not seen that behavior and I have LOTS of customers running this hotfix.  Very interesting.  Have you opened a case on this?  

    This hotfix is also included in Windows 2008 R2 SP1 - so SP1 will be the recommendation moving forward.

  • Just wanted to add to the people that are seeing this with non-R2 servers. I am seeing it on some server 2008 with SP2 machines.

  • Frustratingly, whilst this hotfix solves the problem of WMI leaking uncontrollably, it seems to do so by simply setting a threshold amount of memory (somewhere around 600MB) and terminating and restarting WMIPrvSE when this is exceeded.   We're using WMI to monitor MSMQ, and get alerts and see a drop out when the monitoring goes to grab a WMI counter and gets "Query was syntactically invalid" as a response.  Moments later, the Working set for WMIPrvSE will drop from 550MB to 50MB, and everything will return to normal.  I'm not knocking the hotfix (who wants a server that needs rebooting all the time) but a better fix than this was surely possible?

  • I blogged about this a while back when we found it. The details are here:

    brooke.blogs.sqlsentry.net/.../win32service-memory-leak.html

    It wasn't initially slated to be included until the first service pack, but after pushing hard for them to release a hotfix they did.

  • We have this on a couple of 2008 R2 Server WITH SP1. And the hotfix is not applicable. Thought they fixed it in SP1 ??

  • Hans - you are spot on.  There is another leak, however, it is not nearly as aggressive as this one.  I have been tracking it for a long time, however, we aren't getting any cases on it, and most customers don't notice it because it isn't very aggressive... in a month it doesn't generally leak more than 300-400MB of memory, depending on the server role.  I haven't completed all my testing on this so I haven't published anything about it, but I have observed the same.  It will take a customer reporting this and working with PSS to get some resolution there.

  • Thanks for posting this! I just ran into a situation where WMI was crashing with this error while a third-party application was initiating a SQL backup (causing the backup to fail of course!) and this pointed me in the right direction to resolve the issue.

  • Kevin, was there any update on this? You are right that it is still there and not as aggressive as the first, but it is causing some issues for us on a few of our systems.
