Error alerts from the DNS MP – script failures, WMI Probe failed?
Updated 11/16/09 – I think this is pretty much resolved now – read below:
I have seen this at several customer sites, and even in my own lab. You might find the following alerts (below) stemming from the DNS MP.
To start, I would recommend the resolutions in my previous post: Getting lots of Script Failed To Run alerts- WMI Probe Failed Execution- Backward Compatibility
Everything in that post helps; however, it does not resolve all of the alerts 100% of the time. After about two weeks on a Windows 2003 DC/DNS server, the problem can recur with WMI failures and script errors. Restarting the computer, or restarting WMI, resolves it immediately.
This appears to be an issue with the Windows DNS WMI provider, which causes the Generic failure whenever something tries to access and query the WMI-based DNS namespace. It appears that a TLS (thread local storage) slot leaks every time the DNS WMI provider unloads, and that the provider unloads after 5 minutes of not being accessed. Those who patch and reboot their computers monthly likely won't see this issue at all, or will see it only for a short time until the next patch cycle.
To resolve it, I have written a monitor (example and sample MP below) which queries the DNS WMI namespace every 4 minutes. That keeps the provider loaded, so it never has to unload and never leaks a TLS slot. This has also been shown to resolve some other issues with scripts and latency, caused by the DNS WMI provider having to load back up after an unload.
The events/alerts you may see to define the error condition:
WMI Probe Module Failed Execution
Log Name: Operations Manager
Source: Health Service Modules
Event Number: 10409
Description:
Object enumeration failed
Query: 'Select EventLogLevel from MicrosoftDNS_Server'
HRESULT: 0x80041001
Details: Generic failure
One or more workflows were affected by this.
Workflow name: Microsoft.Windows.DNSServer.2003.Monitor.ServerLoggingLevel
Instance name: dc01.opsmgr.net
Instance ID: {11056C4C-B933-98ED-3DC5-4B9AAE232B23}
Management group: PROD1
WMI Probe Module Failed Execution
Log Name: Operations Manager
Source: Health Service Modules
Event Number: 10409
Description:
Object enumeration failed
Query: 'Select Name, Shutdown, Paused from MicrosoftDNS_Zone'
HRESULT: 0x80041001
Details: Generic failure
One or more workflows were affected by this.
Workflow name: Microsoft.Windows.DNSServer.2003.Monitor.ZoneRunning
Instance name: test.opsmgr.net (dc01.opsmgr.net)
Instance ID: {E0A3BD98-04B7-0C44-B26D-F8E6175456D1}
Management group: PROD1
Script or Executable Failed to run
Log Name: Operations Manager
Source: Health Service Modules
Event Number: 21406
Description:
The process started at 6:26:59 AM failed to create System.Discovery.Data. Errors found in output:
C:\Program Files\System Center Operations Manager 2007\Health Service State\Monitoring Host Temporary Files 10\8675\DNS2003ComponentDiscovery.vbs(123, 9) SWbemServicesEx: Generic failure
Command executed: "C:\WINDOWS\system32\cscript.exe" /nologo "DNS2003ComponentDiscovery.vbs" {C984657D-0255-F11B-2C76-1542793A684D} {11056C4C-B933-98ED-3DC5-4B9AAE232B23} dc01.opsmgr.net true true true "" false 700 1
Working Directory: C:\Program Files\System Center Operations Manager 2007\Health Service State\Monitoring Host Temporary Files 10\8675\
One or more workflows were affected by this.
Workflow name: Microsoft.Windows.DNSServer.2003.Discovery.Components
Instance name: dc01.opsmgr.net
Instance ID: {11056C4C-B933-98ED-3DC5-4B9AAE232B23}
Management group: PROD1
Script or Executable Failed to run
Log Name: Operations Manager
Source: Health Service Modules
Event Number: 21405
Description:
The process started at 3:58:21 AM failed to create System.Discovery.Data, no errors detected in the output. The process exited with 0
Command executed: "C:\WINDOWS\system32\cscript.exe" /nologo "DNS2003Discovery.vbs" {C8655A28-E27E-C6ED-B158-8569219A71A6} {89AC2E61-9144-4B94-9028-5A25F547213E} dc01.opsmgr.net false
Working Directory: C:\Program Files\System Center Operations Manager 2007\Health Service State\Monitoring Host Temporary Files 10\8515\
One or more workflows were affected by this.
Workflow name: Microsoft.Windows.DNSServer.2003.ServerDiscovery
Instance name: dc01.opsmgr.net
Instance ID: {89AC2E61-9144-4B94-9028-5A25F547213E}
Management group: PROD1
Script or Executable Failed to run
Event Type: Error
Event Source: Health Service Script
Event Category: None
Event ID: 1152
Date: 5/19/2009
Time: 11:18:48 AM
User: N/A
Computer: DC01
Description:
DNS2003Discovery.vbs : The Query 'select * from MicrosoftDNS_Server' did not return any valid instances.
Please check to see if this is a valid WMI Query.. Generic failure
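If you want to sort through a pile of these events quickly, something like the following Python sketch might help. It is not part of the DNS MP or of SCOM; the function names are my own, and the "looks like" check is only a rough heuristic based on the event text shown above (the Generic failure wording plus a reference to a MicrosoftDNS_ class or a DNS2003 script). It will not catch the 21405 event, which reports no errors in its output.

```python
import re

# Sample description text, copied from the first 10409 event above.
SAMPLE = """Object enumeration failed
Query: 'Select EventLogLevel from MicrosoftDNS_Server'
HRESULT: 0x80041001
Details: Generic failure
One or more workflows were affected by this.
Workflow name: Microsoft.Windows.DNSServer.2003.Monitor.ServerLoggingLevel
Instance name: dc01.opsmgr.net"""

# Only the "Key: value" lines we care about; the key list is an assumption
# based on the events in this post.
FIELD_PATTERN = re.compile(
    r"\s*(Query|HRESULT|Details|Workflow name|Instance name):\s*(.+)"
)


def parse_event(description):
    """Return the recognized 'Key: value' fields of an event description."""
    fields = {}
    for line in description.splitlines():
        match = FIELD_PATTERN.match(line)
        if match:
            # Drop surrounding whitespace and the quotes around the query.
            fields[match.group(1)] = match.group(2).strip().strip("'")
    return fields


def looks_like_dns_wmi_failure(description):
    """Heuristic: Generic failure text plus a DNS WMI class or DNS 2003 script."""
    text = description.lower()
    return "generic failure" in text and (
        "microsoftdns_" in text or "dns2003" in text
    )


if __name__ == "__main__":
    print(parse_event(SAMPLE))
    print(looks_like_dns_wmi_failure(SAMPLE))
```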
So, at this point, you have updated Cscript to 5.7 (KB955360) and applied the KB933061 hotfix to stabilize WMI. However, after a period of time, do these errors start happening again?
Since the issue is caused by the Windows DNS WMI provider unloading, we need to keep it loaded. Since I believe it unloads after 5 minutes of inactivity, we need to make sure we query WMI at least every 4 minutes. The simplest, cheapest, and easiest way I know to do that is to create a simple performance monitor that queries the DNS WMI namespace for a value every 4 minutes. I have a complete write-up on how to create this monitor at THIS LINK.
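To see why 4 minutes is a safe interval, here is a small Python sketch that models the idea. It is a toy simulation only: it assumes the provider unloads after exactly 300 idle seconds (which is my belief, not a confirmed value), it ignores how long each query takes, and the 900 second comparison interval is just a hypothetical example of a less frequent workflow.

```python
# Assumed idle timeout: the provider appears to unload after 5 minutes.
UNLOAD_AFTER_IDLE_SECONDS = 300


def count_unloads(query_interval_seconds, duration_seconds,
                  idle_timeout_seconds=UNLOAD_AFTER_IDLE_SECONDS):
    """Count provider unloads when the namespace is queried at a fixed interval.

    An unload is counted whenever the gap since the previous query is at
    least the idle timeout, meaning the provider had time to unload (and,
    per the theory above, leak a TLS slot) before the next query arrived.
    """
    unloads = 0
    last_query = 0
    now = query_interval_seconds
    while now <= duration_seconds:
        if now - last_query >= idle_timeout_seconds:
            unloads += 1
        last_query = now
        now += query_interval_seconds
    return unloads


if __name__ == "__main__":
    one_day = 24 * 60 * 60
    # Querying every 240 seconds never leaves a 300 second idle gap.
    print("every 240s:", count_unloads(240, one_day))
    # Hypothetical workflow that only queries every 15 minutes.
    print("every 900s:", count_unloads(900, one_day))
```

In this model, any interval shorter than the idle timeout gives zero unloads, which is the whole point of the monitor.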
I will start by creating a new Management pack - “Custom – DNS Addendum MP”
Next – I will create a new monitor: Unit Monitor > WMI Performance Counters > Static Thresholds > Single Threshold > Simple Threshold.
Give the monitor a name. I used “Custom - DNS Monitor Query to keep namespace loaded”
For the monitor target – since this is a problem only on Windows Server 2003, I chose “DNS 2003 Server”. We do not need to do this on Server 2008.
For the Parent monitor, I chose Performance.
Next, we need to fill in the namespace, query, and frequency. I input “root\MicrosoftDNS” for the namespace, and “Select EventLogLevel from MicrosoftDNS_Server”. Since I want it to run every 4 minutes, that would be 240 seconds:
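If you want to try the same query by hand before building the monitor, a rough equivalent can be sketched in Python by shelling out to the wmic command line tool. This is only an illustration of what the monitor does on its schedule, not a replacement for it. The function names are my own, and the sketch assumes that wmic is available on the box and returns a nonzero exit code when the query fails.

```python
import subprocess
import time

# Values taken from the monitor settings above.
NAMESPACE = r"root\MicrosoftDNS"
WMI_CLASS = "MicrosoftDNS_Server"
PROPERTY = "EventLogLevel"
INTERVAL_SECONDS = 240


def build_wmic_command(namespace=NAMESPACE, wmi_class=WMI_CLASS, prop=PROPERTY):
    """Build the wmic argument list for the keep-alive query."""
    return ["wmic", "/namespace:\\\\" + namespace, "path", wmi_class, "get", prop]


def query_once(run=subprocess.call):
    """Run the query once; True if wmic exited with 0 (assumed to mean success)."""
    return run(build_wmic_command()) == 0


def keep_alive(iterations, query=query_once, sleep=time.sleep,
               interval=INTERVAL_SECONDS):
    """Query the namespace `iterations` times, `interval` seconds apart.

    Returns the number of failed queries.
    """
    failures = 0
    for i in range(iterations):
        if not query():
            failures += 1
        if i < iterations - 1:
            sleep(interval)
    return failures


if __name__ == "__main__":
    # Roughly one hour of keep-alive queries (15 queries, 240 seconds apart).
    print("failed queries:", keep_alive(15))
```

The query and sleep functions are passed in as parameters so the loop can be exercised without touching WMI or actually waiting 4 minutes.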
The performance mapper section is the most confusing part. I explain it a bit deeper at THIS LINK. For now, just follow the graphic below:
Next, on the Threshold page: since this monitor is not really supposed to do anything other than query WMI on a schedule, we don't want it to alert. The query we are running for this example returns an integer from 0 to 10, so I will set the threshold to 99, a number it could never return, so the monitor will never change state.
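The reasoning behind the 99 can be checked with a few lines of Python. This is just a sketch of a simple threshold comparison, not SCOM's actual implementation, and I am assuming the comparison is a plain "greater than"; with a threshold this far out of range, "greater than or equal" would behave the same.

```python
# Threshold chosen in the monitor above.
THRESHOLD = 99

# The query in this example returns an integer from 0 to 10.
POSSIBLE_VALUES = range(0, 11)


def is_over_threshold(value, threshold=THRESHOLD):
    """Assumed comparison: the monitor changes state only above the threshold."""
    return value > threshold


if __name__ == "__main__":
    # Collect every possible value that would flip the monitor's state.
    tripping = [v for v in POSSIBLE_VALUES if is_over_threshold(v)]
    print("values that would change state:", tripping)
```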
Next, on the Alert Settings, do NOT generate alerts for this monitor.
Click Create. That is it.
For those who want to test this, I am attaching my sample management pack with only this monitor in it. To use my MP, you will need to have SCOM R2; otherwise, you can create your own monitor as above.