Kevin Holman's System Center Blog

Posts in this blog are provided "AS IS" with no warranties, and confer no rights. Use of included script samples is subject to the terms specified in the Terms of Use.

OpsMgr: Network utilization scripts in BaseOS MP version 6.0.6958.0 may cause high CPU utilization and service crashes on Server 2003


Recently I discussed some of the changes in the Base OS MP version 6.0.6958.0

OpsMgr- MP Update- New Base OS MP 6.0.6958.0 adds Cluster Shared Volume monitoring, BPA, new rep

 

One of the changes in this newer version of the MP is the addition of a new datasource module, which runs a script to output the Network Adapter Utilization. The name of the datasource is “Microsoft.Windows.Server.2008.NetworkAdapter.BandwidthUsed.ModuleType”. This datasource module uses the timed script property bag provider, along with a generic mapper condition detection. The script name is “Microsoft.Windows.Server.NetwokAdapter.BandwidthUsed.ModuleType.vbs” (the “Netwok” spelling is how the name appears in the script's own event log entries).

 

There are 3 rules, and 3 monitors for each OS (2003 and 2008), which utilize this datasource:

  • Rules:
    • 2008
      • Microsoft.Windows.Server.2008.NetworkAdapter.PercentBandwidthUsedReads.Collection (Percent Bandwidth Used Read)
      • Microsoft.Windows.Server.2008.NetworkAdapter.PercentBandwidthUsedWrites.Collection (Percent Bandwidth Used Write)
      • Microsoft.Windows.Server.2008.NetworkAdapter.PercentBandwidthUsedTotal.Collection (Percent Bandwidth Used Total)
    • 2003
      • Microsoft.Windows.Server.2003.NetworkAdapter.PercentBandwidthUsedReads.Collection (Percent Bandwidth Used Read)
      • Microsoft.Windows.Server.2003.NetworkAdapter.PercentBandwidthUsedWrites.Collection (Percent Bandwidth Used Write)
      • Microsoft.Windows.Server.2003.NetworkAdapter.PercentBandwidthUsedTotal.Collection (Percent Bandwidth Used Total)
  • Monitors:
    • 2008
      • Microsoft.Windows.Server.2008.NetworkAdapter.PercentBandwidthUsedReads (Percent Bandwidth Used Read)
      • Microsoft.Windows.Server.2008.NetworkAdapter.PercentBandwidthUsedWrites (Percent Bandwidth Used Write)
      • Microsoft.Windows.Server.2008.NetworkAdapter.PercentBandwidthUsedTotal (Percent Bandwidth Used Total)
    • 2003
      • Microsoft.Windows.Server.2003.NetworkAdapter.PercentBandwidthUsedReads (Percent Bandwidth Used Read)
      • Microsoft.Windows.Server.2003.NetworkAdapter.PercentBandwidthUsedWrites (Percent Bandwidth Used Write)
      • Microsoft.Windows.Server.2003.NetworkAdapter.PercentBandwidthUsedTotal (Percent Bandwidth Used Total)

 

Only the “Total” rules and monitors are enabled by default; the Read/Write workflows are disabled out of the box by design.

The good:

 

This new functionality is cool because it allows us to monitor the total utilization based on the network bandwidth as a percentage of the “total pipe”, report on this, and view the data in the console:

 

image
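As a rough illustration of what “percentage of the total pipe” means, here is a minimal Python sketch of the arithmetic. It is not the MP's script. It assumes the inputs come from the WMI class Win32_PerfFormattedData_Tcpip_NetworkInterface, where BytesTotalPersec is in bytes per second and CurrentBandwidth is in bits per second; the function name is made up for this example.

```python
def percent_bandwidth_used(bytes_per_sec: float, current_bandwidth_bps: float) -> float:
    """Return utilization as a percentage of the adapter's link speed.

    bytes_per_sec: throughput in bytes/second (assumed: BytesTotalPersec).
    current_bandwidth_bps: link speed in bits/second (assumed: CurrentBandwidth).
    """
    if current_bandwidth_bps <= 0:
        # Some adapters report no bandwidth; avoid dividing by zero.
        return 0.0
    # Throughput counters are in bytes, link speed is in bits, hence the * 8.
    return bytes_per_sec * 8 * 100.0 / current_bandwidth_bps


# Example: 25 MB/s of total traffic on a 1 Gbps adapter.
print(percent_bandwidth_used(25_000_000, 1_000_000_000))
```

The Read and Write variants would be the same calculation fed with BytesReceivedPersec or BytesSentPersec instead of the total.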

 

 

The issue:

 

Since there is no direct perfmon data to collect this, the information must be collected via script.  I wrote about how to write this yourself HERE.

There are 4 known issues with this script in the current Base OS MP, which can cause problems in some environments:

 

1.  When the script executes, it consumes a high amount of CPU (in the WmiPrvSE.exe process) for a few seconds.

2.  The script does not support cookdown, so it runs a cscript.exe process and an instance of the script for EACH and every network adapter in your system (physical or virtual).  This makes the CPU consumption even higher, especially for systems with a large number of network adapters (such as Hyper-V servers).

3.  The script does not support teamed network adapters very well. Teaming is manufacturer/driver dependent, and teamed adapters are often missing the WMI classes expected by the script, so you will see an “invalid class” error on each script execution.

4.  On some Windows 2003 servers, people have reported this script eventually causes a fault in netman.dll, and this can subsequently cause some additional critical services to fault/stop.

Event Type:        Error
Event Source:    Application Error
Event Category:                (100)
Event ID:              1000
Date:                     16/10/2011
Time:                     4:41:09 AM
User:                     N/A
Computer:          WSMSG7104C02
Description:
Faulting application svchost.exe, version 5.2.3790.3959, faulting module netman.dll, version 5.2.3790.3959, fault address 0x0000000000008d4f.
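Issues 2 and 3 in the list above can be pictured with a small Python sketch. It is only an illustration under stated assumptions, not how the MP or any fixed version works: it assumes all perf instances were gathered in one query and handed over as a dictionary keyed by adapter name, and the function name and sample data are hypothetical. Instead of one script run per adapter, one pass covers every adapter, and an adapter with no matching perf instance (as often happens with teamed NICs) gets no value rather than an error.

```python
from typing import Dict, Iterable, Optional


def utilization_for_all(
    adapters: Iterable[str],
    perf_instances: Dict[str, Dict[str, float]],
) -> Dict[str, Optional[float]]:
    """Compute percent bandwidth used for every adapter in a single pass."""
    results: Dict[str, Optional[float]] = {}
    for name in adapters:
        sample = perf_instances.get(name)
        if sample is None or sample.get("CurrentBandwidth", 0) <= 0:
            # No matching perf instance (or no link speed): skip quietly.
            results[name] = None
            continue
        # Bytes/second to bits/second, then as a percentage of link speed.
        bits_per_sec = sample.get("BytesTotalPersec", 0.0) * 8
        results[name] = bits_per_sec * 100.0 / sample["CurrentBandwidth"]
    return results


# Hypothetical sample: one physical NIC with data, one teamed NIC without.
samples = {
    "Local Area Connection": {
        "BytesTotalPersec": 12_500_000,
        "CurrentBandwidth": 1_000_000_000,
    },
}
print(utilization_for_all(["Local Area Connection", "HP Network Team 1"], samples))
```

The point of the single pass is that the cost of the query is paid once per interval, regardless of how many adapters the server has.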

 

 

 

From a CPU perspective, below is an example Hyper-V server with multiple NICs. I set the rule and monitor which use this script to run every 30 seconds for demonstration purposes (they run every 5 minutes by default).

image

 

You can see WMI (and the total CPU) spiking every 30 seconds.

After disabling all the rules and monitors which utilize this data source, we see the following from the same server:

image

 

 

Based on these issues, I’d probably recommend disabling these rules AND monitors for Windows 2003 and Windows 2008.  They seem to create a bit more impact than the usefulness of the data they provide.

 

 

To disable these monitors and rules:

 

Open the Authoring pane of the console.

Highlight “Monitors” in the left pane.

 

In the top line – click “Scope” until you see the “Scope Management Pack Object” pop up:

image

 

In the Look For box – type “Network”:

 

image

 

Tick the boxes next to “Windows Server 2003 Network Adapter” and “Windows Server 2008 Network Adapter” and click OK.

 

image

 

Now you will see a scoped view of only the monitors that target the windows server network adapter classes.  Expand Windows Server 2003 Network Adapter > Entity Health > Performance:

image

 

You can see that Read and Write monitors are already disabled out of the box.  You need to add a new override to disable the “Total” monitor.  Set enabled = false and save it to your Base OS Override MP for Windows 2003.

 

Now, repeat this for the Server 2008 monitor for “Percent Bandwidth Used Total”.

 

After disabling the two monitors that run this script, we also need to disable the rules that share it. Highlight “Rules” in the left pane.

Again, the Read/Write rules are already disabled out of the box, so you only need to create two overrides: one for the Server 2003 “Percent Bandwidth Used Total” rule, and one for the same rule that targets Server 2008:

 

image
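To double-check which workflows actually need an override, the naming pattern listed earlier in this post can be enumerated with a short Python sketch. It only rebuilds the IDs from that pattern and the stated defaults (only “Total” is enabled out of the box); it does not talk to OpsMgr, and the function names are made up for this example.

```python
OS_VERSIONS = ("2003", "2008")
COUNTERS = ("Reads", "Writes", "Total")


def all_workflows():
    """Yield (kind, workflow_id, enabled_by_default) for all 12 workflows."""
    for os_version in OS_VERSIONS:
        for counter in COUNTERS:
            monitor_id = (
                f"Microsoft.Windows.Server.{os_version}"
                f".NetworkAdapter.PercentBandwidthUsed{counter}"
            )
            # Per the post, only the "Total" workflows are enabled by default.
            enabled = counter == "Total"
            yield ("monitor", monitor_id, enabled)
            # Rules share the monitor's name plus a ".Collection" suffix.
            yield ("rule", monitor_id + ".Collection", enabled)


def overrides_needed():
    """Return the workflows that are enabled by default and must be disabled."""
    return [(kind, wf_id) for kind, wf_id, enabled in all_workflows() if enabled]


for kind, wf_id in overrides_needed():
    print(f"{kind}: {wf_id}")
```

This should list four workflows in total: the “Total” monitor and the “Total” rule for each of Server 2003 and Server 2008.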

Comments
  • Thx very much Kevin.

    Why do u have to disable Rules AND Monitors? (Why not just the Rules?)

    Cheers,

    John Bradshaw

  • Kevin, saw your comment on the other blog. Greatly appreciate you looking in to this. I will keep the 2003 rules/monitors disabled and also disable the ones for 2008 as well so we hopefully alleviate the issue on those servers, even though I haven't seen it on any 2008 servers yet.

    I've got a ticket open at the moment in regards to this issue so I will point them towards this post and see what they say.

  • @JB - because BOTH rules and monitors run the script.  If you dont turn everything off, the script will run and the potential impact will occur.

    @Gary - feel free to come back and comment on your ticket outcome, or shoot me the details in email.  I'm interested.

  • Kevin, great article and very timely......those rules were definitely causing problems for a lot of my servers....especially on our Citrix servers.....I'd be interested to know what Microsoft plans to do to fix this....

  • We too are seeing similar impact on our servers with the associated service crashes. I have had a case open for some time with extra debugging on the netman.dll but to no avail - each server we have the debugging on we see no more crashes :(

    The only solution for us will be to disable the rules and monitors. It absolutely looks like the SCOM agents are exposing a netman.dll bug but it's something we can't nail down here. I'll be interested if anyone else ever gets a real solution, other than disabling monitoring.

  • Hey Kevin!

    Is this problem still under investigation and is there a timeline for a fix? As Warren mentioned disabling monitoring is not a real solution.

    Best Regards

  • Peter -  it is being evaluated for the next Base OS MP, but I dont have any timeline details.  I disagree with you, however - disabling this SPECIFIC monitor and rules IS a valid workaround - as this is a new monitor that just showed up in this version, it was not present in any previous versions, so there was no customer dependencies on this monitoring.  It was simply a value add monitor.... as are many additions to the base OS MP or any other MP over time.  Keep in mind sometimes these issues require hotfixes for Windows as they arent always an OpsMgr issue - sometimes they are a core WMI namespace issue and the fix is up to the Windows team.  I dont know for sure in this case.

  • Hi

    we have still the same problem. can anyone explain how to disable the script since i am new in SCOM!

    Thanks

  • Hi Kevin,

    Is this issue still exist in BaseOS MP version 6.0.6972.0? (SCOM 2012)

    Best Regards,

    Gemmy

  • @GemmySit -

    No.  This is fixed in 6.0.6972.0 and is stated in the MP guide for the new MP and here:  blogs.technet.com/.../opsmgr-mpupdate-new-base-os-mp-6-0-6972-0-adds-new-cluster-disks-changes-free-space-monitoring-other-fixes.aspx

  • Kevin, apparently this monitor/rule is still causing issues with NIC Teaming and complaining about an invalid WMI class. I went ahead and disabled these Monitors/Rules anyway, but was wondering, based on your last comment, whether this was/is in fact fixed in the latest base OS MP?

    Log Name:      Operations Manager
    Source:        Health Service Script
    Date:          10/9/2012 9:47:01 AM
    Event ID:      4001
    Task Category: None
    Level:         Error
    Keywords:      Classic
    User:          N/A
    Computer:      WEB06.prod.local
    Description:
    Microsoft.Windows.Server.NetwokAdapter.BandwidthUsed.ModuleType.vbs : The class name 'Win32_PerfFormattedData_Tcpip_NetworkInterface Where Name ='Internal:HP Network Team _1'' did not return any valid instances.  Please check to see if this is a valid WMI class name.. Invalid class

    Event Xml:
    <Event xmlns="schemas.microsoft.com/.../event">
     <System>
       <Provider Name="Health Service Script" />
       <EventID Qualifiers="0">4001</EventID>
       <Level>2</Level>
       <Task>0</Task>
       <Keywords>0x80000000000000</Keywords>
       <TimeCreated SystemTime="2012-10-09T14:47:01.000000000Z" />
       <EventRecordID>55675</EventRecordID>
       <Channel>Operations Manager</Channel>
       <Computer>WEB06.prod.local</Computer>
       <Security />
     </System>
     <EventData>
       <Data>Microsoft.Windows.Server.NetwokAdapter.BandwidthUsed.ModuleType.vbs</Data>
       <Data>The class name 'Win32_PerfFormattedData_Tcpip_NetworkInterface Where Name ='Internal:HP Network Team _1'' did not return any valid instances.  Please check to see if this is a valid WMI class name.. Invalid class </Data>
     </EventData>
    </Event>

  • The High CPU issue was fixed in a recent Base OS MP for Server 2008 and 2008R2.  We still recommend leaving it disabled for Windows Server 2003.

    The script error you are seeing is something different.  Gathering perfmon data for teamed NICs has always been a challenge, because the teamed NIC is a virtual NIC and the driver instantiates the data into perfmon and WMI, and it may or may not match our discovered properties.  Therefore, if you see these script failures for teamed NICs, you should disable it for all teamed NICs.  You will notice that MOST perf data collections have always failed in SCOM for teamed NICs.... we need to change how we discover teamed NICs or get HP, Broadcom etc. to use a standard.  Microsoft now offers NIC teaming as part of the OS in Windows Server 2012, so I imagine we will get that one right.  :-)

  • Hello,

    Is it still current to disable the rules for Windows Server 2003 with OS MP 6.0.7061.

    Any other updates? SCOM 2007 so far

    Thanks,

    Dom

  • bump

    currently have both rules and monitors disabled for 2003, 2008 and 2012 adapters and still getting errors on teamed nics with mp 7061. Is there any new rule/monitor in place in this version that also runs this silly script?

  • @Steve G: What version of the OS Management pack are you running? I will check.

    What specific errors are you getting?
