Kevin Holman's System Center Blog

Posts in this blog are provided "AS IS" with no warranties, and confer no rights. Use of included script samples is subject to the terms specified in the Terms of Use.

WMI leaks memory on Server 2008 R2 monitored agents


 

Here is something that a customer brought to my attention, and is probably impacting you already.

 

They noticed that WMI on some of their Server 2008R2 monitored agents was consuming a large amount of memory – and continually increasing.  I started tracking this in SCOM by writing a rule to collect the Process\Private Bytes of all WMI processes (WmiPrvSE*) to check.

Sure enough, a handful (but, strangely, not all) of my Windows 2008 R2 monitored servers are exhibiting this behavior.  Below is a graph where you can see most processes are consuming ~20MB or less, but some are steadily increasing, consuming 400MB of RAM or more.
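If you export the collected Private Bytes samples, a small script can pick out the instances that climb steadily rather than hover around a baseline. This is only a rough sketch: the function name, the thresholds, and the sample numbers are made up for illustration, not taken from the rule or the graph.

```python
from typing import Dict, List

MB = 1024 * 1024


def find_leak_suspects(
    samples: Dict[str, List[int]],
    min_growth_bytes: int = 100 * MB,   # assumed threshold, tune to taste
    min_rising_fraction: float = 0.9,   # assumed: 90% of intervals must rise
) -> List[str]:
    """Return instance names whose Private Bytes grow steadily over time.

    `samples` maps a counter instance name (e.g. "WmiPrvSE#1") to its
    Private Bytes readings in chronological order.
    """
    suspects = []
    for instance, values in samples.items():
        if len(values) < 3:
            continue  # too few points to call it a trend
        growth = values[-1] - values[0]
        rising = sum(1 for a, b in zip(values, values[1:]) if b > a)
        if growth >= min_growth_bytes and rising / (len(values) - 1) >= min_rising_fraction:
            suspects.append(instance)
    return sorted(suspects)


if __name__ == "__main__":
    # Illustrative data only: one flat process, one that keeps growing.
    demo = {
        "WmiPrvSE": [18 * MB, 19 * MB, 18 * MB, 20 * MB],
        "WmiPrvSE#1": [20 * MB, 120 * MB, 250 * MB, 400 * MB],
    }
    print(find_leak_suspects(demo))
```

Requiring both a minimum total growth and a mostly rising series keeps a single spike from being reported as a leak.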

 

image

 

If it goes long enough – occasionally you might also see this in your event logs:

Log Name:      Application
Source:        Application Error
Date:          3/10/2010 4:24:35 PM
Event ID:      1000
Task Category: (100)
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      VS5.opsmgr.net
Description:
Faulting application name: wmiprvse.exe, version: 6.1.7600.16385, time stamp: 0x4a5bc794
Faulting module name: ole32.dll, version: 6.1.7600.16385, time stamp: 0x4a5be01a
Exception code: 0xc0000005
Fault offset: 0x0000000000039389
Faulting process id: 0x180
Faulting application start time: 0x01cabfafa91cc252
Faulting application path: C:\Windows\system32\wbem\wmiprvse.exe
Faulting module path: C:\Windows\system32\ole32.dll
Report Id: b45b5a1d-2c93-11df-ac21-001b213a78be
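Event ID 1000 from Application Error is logged for any crashing application, so it helps to look at the description before blaming WMI. The sketch below pulls a few fields out of a description like the one above and checks whether the faulting application is wmiprvse.exe. The function names are hypothetical, and it assumes the description text has already been exported from the event log.

```python
import re
from typing import Dict

# Matches the "Name: value" lines of interest; the value stops at the first comma.
FIELD_RE = re.compile(
    r"^(Faulting application name|Faulting module name|Exception code):\s*([^,\r\n]+)",
    re.MULTILINE,
)


def parse_fault(description: str) -> Dict[str, str]:
    """Extract selected fields from an Application Error event description."""
    return {m.group(1): m.group(2).strip() for m in FIELD_RE.finditer(description)}


def is_wmi_provider_crash(description: str) -> bool:
    """True if the faulting application is the WMI provider host."""
    fields = parse_fault(description)
    return fields.get("Faulting application name", "").lower() == "wmiprvse.exe"


if __name__ == "__main__":
    sample = (
        "Faulting application name: wmiprvse.exe, version: 6.1.7600.16385, time stamp: 0x4a5bc794\n"
        "Faulting module name: ole32.dll, version: 6.1.7600.16385, time stamp: 0x4a5be01a\n"
        "Exception code: 0xc0000005\n"
    )
    print(parse_fault(sample))
    print(is_wmi_provider_crash(sample))
```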

 

It turns out there is a hotfix for Windows 2008 R2 – which addresses a possible leak when an application queries the Win32_Service class frequently.  A monitoring tool would do this – and therefore OpsMgr can accelerate this leak in the OS.

http://support.microsoft.com/kb/981314

 

This hotfix addresses this issue – I applied it to my servers – and they are no longer leaking memory from the WMI process.
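To see which servers already have the fix, one option is to check the installed hotfix list for KB981314. The sketch below assumes the list comes from `wmic qfe get HotFixID` and that the hotfix shows up there under the ID KB981314; both are assumptions to verify in your environment, and the helper names are hypothetical.

```python
import subprocess
from typing import Set


def parse_hotfix_ids(wmic_output: str) -> Set[str]:
    """Collect KB identifiers from one-ID-per-line output, skipping the header."""
    ids = set()
    for line in wmic_output.splitlines():
        token = line.strip().upper()
        if token.startswith("KB") and token[2:].isdigit():
            ids.add(token)
    return ids


def has_hotfix(wmic_output: str, kb: str = "KB981314") -> bool:
    """True if the given KB appears in the hotfix listing."""
    return kb.upper() in parse_hotfix_ids(wmic_output)


def query_installed_hotfixes() -> str:
    """Run the query locally (Windows only); output encoding may need adjusting."""
    result = subprocess.run(
        ["wmic", "qfe", "get", "HotFixID"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Illustrative listing, not real output from any server.
    listing = "HotFixID  \nKB976902  \nKB981314  \n"
    print(has_hotfix(listing))
```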

 

image

 

I am adding this hotfix to my recommended hotfixes link, in the OS section.

 

http://blogs.technet.com/b/kevinholman/archive/2009/01/27/which-hotfixes-should-i-apply.aspx

 

 

These are some signs that this might be impacting you in OpsMgr:

 

You might get some alerts in the console like the following:

Workflow Runtime: Failed to run a process or script

The process started at 1:22:12 AM failed to create System.Discovery.Data. Errors found in output:

C:\Program Files\System Center Operations Manager 2007\Health Service State\Monitoring Host Temporary Files 5\4235\AlertUpdateConnectorDiscovery.vbs(16, 1) SWbemObjectSet: No more threads can be created in the system.

Command executed: "C:\Windows\system32\cscript.exe" /nologo "AlertUpdateConnectorDiscovery.vbs" {A7504CAE-3EA5-5B1F-CDA4-A4593E4D85FD} {F8AEF188-D663-9719-3FD8-94B2AF6F0726} SQL2V1.opsmgr.net
Working Directory: C:\Program Files\System Center Operations Manager 2007\Health Service State\Monitoring Host Temporary Files 5\4235\

One or more workflows were affected by this.

Workflow name: AlertUpdateConnector.ConnectorDiscovery
Instance name: SQL2V1.opsmgr.net
Instance ID: {F8AEF188-D663-9719-3FD8-94B2AF6F0726}
Management group: PROD1

Or:

Workflow Runtime: Failed to run a WMI query

Object enumeration failed

Query: 'SELECT DisplayName, Name, StartMode FROM Win32_Service WHERE Name="ClusSvc" and StartMode!="Disabled"'
HRESULT: 0x80041006
Details: Out of memory

One or more workflows were affected by this.

Workflow name: Microsoft.Windows.Cluster.Service.Discovery
Instance name: SQL2CLN1.opsmgr.net
Instance ID: {90476733-8FA9-1718-152C-932FF9AB9BC6}
Management group: PROD1

Or:

Workflow Runtime: Failed to run a WMI query

Object enumeration failed


Query: 'SELECT NumberOfProcessors FROM Win32_ComputerSystem WHERE DomainRole >1'
HRESULT: 0x800705af
Details: The paging file is too small for this operation to complete.

One or more workflows were affected by this.

Workflow name: System.Mom.BackwardCompatibility.Computer.Server.DiscoveryRule
Instance name: SQLDB1.opsmgr.net
Instance ID: {AF7C2749-FF52-E354-EEAE-8CFCA3541607}
Management group: PROD1
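The two HRESULT values in these alerts can be unpacked to confirm that they are memory-related. An HRESULT keeps a failure bit in bit 31, a facility in bits 16-26, and an error code in the low 16 bits. For 0x800705af the facility is 7 (Win32) and the code is 1455, the Win32 error for "The paging file is too small for this operation to complete"; 0x80041006 has facility 4 and is the WMI out-of-memory error. A minimal sketch, with a hypothetical function name:

```python
from typing import Dict


def decode_hresult(value: int) -> Dict[str, object]:
    """Split a 32-bit HRESULT into failure flag, facility, and error code."""
    return {
        "failure": bool((value >> 31) & 1),   # bit 31: severity
        "facility": (value >> 16) & 0x7FF,    # bits 16-26: facility
        "code": value & 0xFFFF,               # bits 0-15: error code
    }


if __name__ == "__main__":
    for hr in (0x80041006, 0x800705AF):
        print(hex(hr), decode_hresult(hr))
```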


The details of the script, discovery, or workflow are irrelevant.  What is relevant here is seeing the messages “No more threads can be created in the system”, “Out of memory”, and “The paging file is too small for this operation to complete”.

Those are tell-tale signs of a memory leak or memory pressure, and in this case caused by WMI.
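If you export alert descriptions to text, a quick scan for these three messages can flag the agents worth a closer look. This is a minimal sketch with a hypothetical function name; the strings are the ones quoted in the alerts above.

```python
from typing import List

# Messages from the alerts above that point to memory pressure.
SIGNATURES = (
    "No more threads can be created in the system",
    "Out of memory",
    "The paging file is too small for this operation to complete",
)


def memory_pressure_signs(alert_text: str) -> List[str]:
    """Return the signature messages found in an alert description."""
    lowered = alert_text.lower()
    return [s for s in SIGNATURES if s.lower() in lowered]


if __name__ == "__main__":
    alert = "Object enumeration failed\nHRESULT: 0x80041006\nDetails: Out of memory"
    print(memory_pressure_signs(alert))
```

A match only says the alert looks memory-related; it still takes the Private Bytes counter to confirm that WMI is the process that is growing.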

 

Sure enough – when I check this system, I can easily see there is an issue:

 

image

 

If you are running Server 2008R2 on ANY monitored system, it is highly likely that you need to apply this hotfix. 

I recommend it across the board for all Windows 2008 R2 monitored agents, until Windows Server 2008R2 SP1 releases, or something supersedes this.

Comments
  • Great job catching it! I don't have too many customers on R2 yet, but this is one for the toolbox :)

  • Hi Kevin,

    How did you create the rule? I need to show the same graphs to confirm whether or not I have the same issue.

    I started

    Authoring Tab > Management Pack Objects > Rules > Create a new Rule > Collection Rules > Event Based > WMI Event (associated with a custom MP) > Rule Target "Windows Server 2008 R2 Computer Group"  > WMI Namespace >

    or

    Authoring Tab > Management Pack Objects > Rules > Create a new Rule > Collection Rules > Performance Based > WMI Event (associated with a custom MP) > Rule Target "Windows Server 2008 R2 Computer Group"  > WMI Namespace >

    or

    Authoring > Management Pack Templates > Windows Service

    Which way is the best ?

    Thanks,

    Dom

  • Dom - this is not a WMI event.  This is a simple perf counter.

    Process \ Private Bytes \ WmiPrvSE*

    Just use WmiPrvSE* for the instance to get them all.... in a perf collection rule.

  • Also - NEVER target a GROUP - that is OpsMgr 101.  You cannot target groups with any workflow.

    You need to target a non-singleton Class - such as "Windows Server Operating System" or "Windows Server 2008 Operating System"

  • Hello,

    or i could do also

    Authoring Tab > Management Pack Objects > Rules > Create a new Rule > Collection Rules > NT Event: Log

    Event ID: 1000

    Event Source: Application Error

    Event Level: Error

    But this one seems too generic, doesn't it?

    Thanks,

    Dom

  • Hi Kevin,

    Thanks let me change this and see how it goes...

    Dom

  • Hi Kevin,

    The rule (called: xxxx - WMI Leakage on Windows Server 2008 R2) is in place, "Disabled" by default and overridden for the class "Windows Server 2008 R2 Operating System" to "Enabled = True". I have a group of 35 servers with Windows Server 2008 R2 Operating System in the Windows 2008 R2 Computer Group. The whole Windows Server 2008 Computer Group has 106 servers.

    Rule Target: Windows Server 2008 Operating System

    I created a Performance View on data related to "Windows Server Operating System" for the group "Windows Server 2008 R2 Computer Group" with View Performance collected by ... when browsing for the rule I do not see it available in the list.... Under Authoring > Management Pack Objects > Rules it is available... should I change one parameter in the view... the "data related to" and/or the group?

    Don't forget that the Rule and the View should be in the same Category. I used Performance Collection for both, whereas I had tried "Custom" category earlier!!! (:

    Now it works...

    I will wait now several days of collection to get more data.

    Thanks,

    Dom

  • We have 28 servers upgraded to R2. After running the counter against all R2 servers for a day we can already see one candidate for the hotfix. The server's WMI service went from 15MB to 147MB of memory consumption. Thanks for the tip!

  • Is this only an issue for 2008 R2 - not 2008?

  • Peter - as far as I know - yes this is only an issue for Windows Server 2008 R2.

  • Hello Kevin

    I downloaded this hotfix but there only seems to be an x86 version of the hotfix. We are using the x64 version of Windows 2008 R2.

    Do you know if there is an x64 version of this hotfix, or is it not required for x64 platforms?

    Thanks in advance

  • All versions are available - you need to hit the "Show hotfixes for all platforms and languages" link, because by default this hotfix page will only display your detected OS.

    There is an x86 version for Win7.  Server 2008R2 is 64 bit only.

  • I see this exact same issue on my Exchange 2007 servers....the only thing is that they are not running R2, -only Windows2008 with SP2.

    From what I can tell, many of the WMI hotfixes are already covered by SP2, so I am going nuts trying to figure out what is causing the WMI errors in the event logs.

    I too am getting "paging file is too small" errors.  I'm also seeing "Not enough storage is available to complete this operation".  I checked the WMI process though and memory usage was normal, so I don't understand why I am seeing these kinds of WMI errors if the WMI process is not consuming a large amount of memory...

  • I will copy my response from your post on Systemcentercentral.com:

    I am tracking a similar issue with another customer. There are lots of things that can cause leaks in WMI. The only hotfix I can find is for when perfmon is used remotely, which I don't think is related.

    We do see WMIPRVSE processes using over 100 MB (as high as 500MB) when this condition exists. Bouncing the server does resolve the issue, until the WMI processes get large again.

    There are many things that might be causing this, like HP/Dell hardware agents or other software agents that access and use WMI, and it could possibly be OpsMgr related, but it seems to only happen on systems with LARGE amounts of RAM. I am still investigating and will post something when I find it - but it would be good to open a case with Microsoft from the Windows side, and have them investigate the root cause of WMI apparently leaking memory.

  • Thx Kevin.  I will be contacting Microsoft support on this and will update the msg thread on SystemCenterCentral and on here to let everybody know what the findings are.
