Kevin Holman's System Center Blog

Posts in this blog are provided "AS IS" with no warranties, and confer no rights. Use of included script samples is subject to the terms specified in the Terms of Use.

Do you randomly see a MonitoringHost.exe process consuming lots of CPU?



Occasionally, you might see a single MonitoringHost.exe process on an agent consuming 100% CPU (or 50%, or 25%, depending on how many cores you have).  The process stays at this level and does not recover.  If you restart the OpsMgr HealthService, the problem goes away, and it might not return for days or even weeks.
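The 100% / 50% / 25% figures are simple arithmetic. Here is a minimal sketch, assuming the runaway process is spinning on exactly one logical processor and that CPU usage is reported as a share of the whole machine. The function name is mine, for illustration only.

```python
def pegged_core_share(logical_cpus: int) -> float:
    """Percent of total machine CPU shown when one thread saturates one logical processor."""
    if logical_cpus < 1:
        raise ValueError("logical_cpus must be at least 1")
    return 100.0 / logical_cpus


# Assumption: a single spinning thread, so only one logical processor is fully busy.
for cpus in (1, 2, 4):
    print(f"{cpus} logical CPU(s): about {pegged_core_share(cpus):.0f}% total CPU")
```

So on a quad-core agent, a process that sits at a steady 25% is worth the same suspicion as one at 100% on a single-core box.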

 

This particular symptom might be due to an XML spinlock issue. This is a core Windows OS issue, and there is a hotfix available, which I have listed on my HOTFIX LINK page.

 

The KB is 968967:

“The CPU usage of an application or a service that uses MSXML 6.0 to handle XML requests reaches 100% in Windows Server 2008, Windows Vista, Windows XP Service Pack 3, or other systems that have MSXML 6.0 installed”

I have seen that most customers are affected by this issue from time to time.  I have seen it very commonly in my lab, on Server 2008 domain controllers and on my Server 2008 Hyper-V hosts.

 

 

A note on patching Server 2008:

 

When you go to download this hotfix for a Server 2008 machine, it is not obvious which hotfix to get.  Here is the list of all available fixes:

 

image

 

To patch Server 2008, you need to download the “Windows Vista” hotfix, in either x86 or x64, depending on your OS architecture:

 

image

 

 

 

Monitoring for this condition:

You can easily write a threshold monitor targeting Agent or Health Service to track the Process \ % Processor Time counter for the MonitoringHost instance, and set it to alert when multiple consecutive samples are above a defined threshold.

 

Here is an example of creating this monitor:

Authoring Pane > Monitors > New Unit Monitor > Windows Performance Counters > Static Thresholds > Single Threshold > Consecutive Samples over Threshold.

 

image

 

Give it a custom name that follows your documented custom monitor naming standard, target “Health Service”, and put it under the Performance rollup.

 

image

 

Click the “Select” button (in SP1, click “Browse”).  In the perf counter picker, choose a server with an installed agent, then choose the object “Process”, the counter “% Processor Time”, and the instance “MonitoringHost”, and click OK.

 

image

 

Since there are multiple MonitoringHost processes, we will add a wildcard to the instance name in the monitor. This will monitor ANY MonitoringHost process for high CPU.  Set the interval to every 1 minute.

 

image
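To illustrate why the wildcard matters: when several processes share a name, Windows exposes them as separate performance counter instances with a "#n" suffix. Below is a rough sketch of the matching idea using Python's `fnmatch`. The pattern `MonitoringHost*` and the sample instance names are my assumptions for illustration, and OpsMgr's own wildcard matching may not behave exactly like `fnmatch`.

```python
from fnmatch import fnmatchcase

# Assumption: the wildcard pattern typed into the monitor's Instance field.
INSTANCE_PATTERN = "MonitoringHost*"


def matching_instances(instances, pattern=INSTANCE_PATTERN):
    """Return the counter instance names that the wildcard pattern would cover."""
    return [name for name in instances if fnmatchcase(name, pattern)]


# Hypothetical instance names as they might appear under the Process object.
sample = ["MonitoringHost", "MonitoringHost#1", "MonitoringHost#2",
          "HealthService", "svchost#3"]
print(matching_instances(sample))
```

Without the wildcard, only the plain "MonitoringHost" instance would be sampled, and the runaway process could easily be one of the others.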

 

The number of consecutive samples and the threshold are up to you.  For me, if I detect a single MonitoringHost process using more than 50% CPU across 5 consecutive samples (5 minutes), then I consider that bad:

 

image

image

 

image
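The "consecutive samples over threshold" logic is easy to reason about in a few lines of code. This is only a sketch of the idea, not how the OpsMgr module is implemented. The function name is hypothetical, and I am assuming a strict "greater than" comparison to match "more than 50%".

```python
def consecutive_over_threshold(samples, threshold=50.0, required=5):
    """Return True once `required` samples in a row are strictly above `threshold`."""
    streak = 0
    for value in samples:
        # A sample at or below the threshold resets the streak to zero.
        streak = streak + 1 if value > threshold else 0
        if streak >= required:
            return True
    return False


# One-minute samples: a short spike should not alert, a sustained spin should.
print(consecutive_over_threshold([12, 95, 97, 20, 96, 98]))   # spike, then reset
print(consecutive_over_threshold([12, 99, 99, 99, 99, 99]))   # five in a row
```

The reset on a single low sample is the point of this monitor type: a busy workflow that briefly burns CPU never reaches five in a row, while the stuck process does.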

 

At this point, you can simply alert on the condition, or even try adding a recovery script that will bounce the HealthService.  Bouncing the HealthService when one of the processes is using all the CPU is not always reliable, especially from a “NET STOP & NET START” type command.  I have found it more reliable to just kill the MonitoringHost process in this condition and allow it to respawn, but your mileage may vary.

http://blogs.technet.com/kevinholman/archive/2008/03/26/using-a-recovery-in-opsmgr-basic.aspx
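As a rough illustration of the "kill the busy process" approach, here is a sketch that picks the hottest MonitoringHost.exe from a process snapshot and builds a `taskkill` command for it. The function names, the snapshot format, and the PIDs are all hypothetical. A real recovery would run as a script inside the management pack, and this sketch defaults to a dry run that only returns the command.

```python
import subprocess


def pick_runaway(processes, threshold=50.0):
    """processes: iterable of (pid, name, cpu_percent) tuples.

    Return the PID of the busiest MonitoringHost.exe above the threshold, or None.
    """
    candidates = [
        (cpu, pid)
        for pid, name, cpu in processes
        if name.lower() == "monitoringhost.exe" and cpu > threshold
    ]
    return max(candidates)[1] if candidates else None


def kill_command(pid):
    # taskkill is the built-in Windows tool; /F forces termination of the process.
    return ["taskkill", "/PID", str(pid), "/F"]


def recover(processes, dry_run=True):
    """Build (and optionally run) the kill command for the runaway process."""
    pid = pick_runaway(processes)
    if pid is None:
        return None
    command = kill_command(pid)
    if not dry_run:
        subprocess.run(command, check=True)  # Windows only
    return command


# Hypothetical snapshot of processes on an agent.
snapshot = [
    (1204, "MonitoringHost.exe", 3.0),
    (2288, "MonitoringHost.exe", 97.5),
    (640, "HealthService.exe", 1.0),
]
print(recover(snapshot))
```

Targeting only the one process above the threshold leaves the healthy MonitoringHost processes alone, and the HealthService respawns the one that was killed.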

Comments
  • I had kept only the 4 rules and 12 monitors listed below enabled, and I am still facing this issue. Never mind, I will follow the steps you suggested and share the result. Thanks a lot for your help :)
    Rules:
      • Performance Measuring: Print Queue\Jobs
      • Performance Measuring: Print Queue\Jobs Spooling
      • Performance Measuring: Print Queue\Total Jobs Printed
      • Performance Measuring: Print Queue\Total Pages Printed
    Monitors:
      • LPD Service: Service Status Monitor
      • Print Spooler: Check Windows resources
      • Print Spooler: Check Windows resources (Windows Server 2008 R2)
      • Print Spooler: The print spooler failed to complete a task
      • Print Spooler: The print spooler failed to complete a task (Windows Server 2008 R2)
      • Print Spooler: Restart the Print Spooler service
      • Print Spooler: Restart the Print Spooler service (Windows Server 2008 R2)
      • Print Spooler: Restart the server or troubleshoot hardware problems
      • Print Spooler: Restart the server or troubleshoot hardware problems (Windows Server 2008 R2)
      • Print Spooler: Service Status Monitor
      • Print Server 2008 Queue Job Errors
      • Print Server 2008 Queue Not Ready Errors

  • Hi Kevin, we have installed SCOM 2012 ur2. We pushed the agent from the SCOM console to the client servers. On domain controllers running Windows Server 2012 Standard, MonitoringHost.exe consumes high CPU. Is KB 974051 applicable to this issue, and if so, which hotfix do we need to install on Windows Server 2012 domain controllers?

  • Can we get a patch for Windows Server 2008 R2 servers for the same issue?
