Are your agents restarting every 10 minutes? Are you sure?

**Updated 12-21-2009

This post is OLD, and the way this process works has changed.

Please see the updated post at:

http://blogs.technet.com/kevinholman/archive/2009/12/21/the-new-and-improved-guide-on-healthservice-restarts-aka-agents-bouncing-their-own-healthservice.aspx

Published Thursday, March 26, 2009 7:50 PM by kevinhol

Comments

# re: Are your agents restarting every 10 minutes? Are you sure?

Wednesday, April 22, 2009 1:09 PM by Layne

Kevin, any idea why, when trying to override the Health Service Private Bytes Threshold monitor, you cannot override it for a group of computer objects that you've created if you choose override "for a group"? The only groups that appear in the list are groups created by management packs, etc., not any user-created groups. To override for a group you've created, you have to choose override "for all objects of another type", view all targets, and then choose the group you've created.

Conversely, the rule for Monitoring Host Private Bytes Threshold will allow you to override for a group that you've created if you choose "override for a group".

Great articles, keep up the good work.

# re: where is my group?

Wednesday, April 22, 2009 1:27 PM by kevinhol

This is a bug in SP1 - when you try and do this from the Authoring pane.

Open Health Explorer - create the override there, and you will see your group.  :-(

# re: Are your agents restarting every 10 minutes? Are you sure?

Sunday, May 24, 2009 9:57 PM by DilipManchala

Hi Kevin, as you suggested in the post above, I changed the Health Service Private Bytes threshold to 200MB for a particular SCOM agent. But I still see my agent not getting restarted.

I am running the nworks application on this particular computer and also see my healthservicestore.edb size growing 220 MB. Is this normal?

I also see the below errors in the event log on the same agent machine.

Event ID:4506

Event Source:HealthService

Description:

Data was dropped due to too much outstanding data in rule "many" running for instance "many" with id:"many" in management group "XXXXXX".

# re: Nworks

Tuesday, May 26, 2009 8:56 AM by kevinhol

The Nworks MP causes an agent to act as a proxy - and load workflows and collect data for potentially a HUGE number of machines.  Therefore it is common that this agent will need more memory for this.

This is documented in the Nworks documentation I believe.

I would disable these monitors for those instances - and measure how much they *consume* and how fast they consume it - and stop bouncing them.

If you must set a value - I would start at 600-800MB for privatebytes.... and monitor the consumption closely.
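One way to put numbers on "how much" and "how fast" is to log the process's private bytes at intervals and summarize the samples. The following is only a minimal sketch: the function name and the sample values are made up for illustration, and collecting the samples (for example from Performance Monitor) is left out.

```python
def summarize_private_bytes(samples):
    """Summarize private bytes samples.

    samples: list of (hours_since_start, private_bytes_mb), ordered by time.
    Returns (peak_mb, growth_mb_per_hour) between the first and last sample.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("samples must span a positive time range")
    peak = max(value for _, value in samples)
    growth = (v1 - v0) / (t1 - t0)
    return peak, growth


# Hypothetical readings, not real measurements.
readings = [(0, 150), (6, 300), (12, 450)]
peak, growth = summarize_private_bytes(readings)
print(f"{peak} MB peak, {growth:.1f} MB/hour")
```

A threshold chosen comfortably above the observed peak, with the growth rate watched over several days, avoids guessing at a value.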

# re: Are your agents restarting every 10 minutes? Are you sure?

Wednesday, May 27, 2009 3:16 PM by JHBoricua

I've created the custom rules and the custom view per your post, but the 'Source' column on my alerts shows "Microsoft(R) Windows(R) Server 2003, Enterprise Edition" rather than the actual server name as shown in your screenshot. How did you get it to show the server name in the Source column of your custom view?

# re: servername in the view?

Wednesday, May 27, 2009 4:08 PM by kevinhol

You need to personalize the view - and add "Path" next to "source".

This is true for any alert view - depending on the target class of an alert, the FQDN will either be in Source or Path.... not always both and not consistent.

# re: Are your agents restarting every 10 minutes? Are you sure?

Tuesday, June 09, 2009 6:39 AM by snajgel

When working with multihomed agents you need to do the override in the other management group as well.

# re: Multi-homed agents

Tuesday, June 09, 2009 9:44 AM by kevinhol

Yep - I just ran into that yesterday with a customer.  He had some restarting all the time - because they were multi-homed with his pre-prod management group.

# re: Are your agents restarting every 10 minutes? Are you sure?

Tuesday, June 09, 2009 3:55 PM by snajgel

Do you think they restart because they are multi-homed?

# re: are they restarting because they are multi-homed?

Tuesday, June 09, 2009 4:39 PM by kevinhol

No - they are restarting because BOTH management groups are monitoring the SAME healthservice process... and the lower value from either MG will bounce the service.

Overrides will need to be kept in synch for each management group - for this.
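To illustrate why the lowest value wins, here is a small sketch. The management group names and threshold values are hypothetical, and this only models the behavior described above rather than any OpsMgr API.

```python
def effective_restart_threshold_mb(thresholds_by_mg):
    """Return (management_group, threshold_mb) for the group whose
    threshold is reached first, i.e. the lowest one."""
    if not thresholds_by_mg:
        raise ValueError("at least one management group is required")
    mg = min(thresholds_by_mg, key=thresholds_by_mg.get)
    return mg, thresholds_by_mg[mg]


# Hypothetical values: the override was raised in PROD only.
overrides = {"PROD": 600, "PREPROD": 100}
mg, limit = effective_restart_threshold_mb(overrides)
print(f"{mg} bounces the service first, at {limit} MB")
```

Raising the override in only one management group changes nothing for the agent as long as the other group still carries the lower value.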

# re: R2

Tuesday, June 09, 2009 4:40 PM by kevinhol

By the way - this has changed quite a bit in R2 - when I have some time - I am going to document how that works.... and how it is different than SP1.

# re: Are your agents restarting every 10 minutes? Are you sure?

Wednesday, October 21, 2009 6:57 PM by Dominique

The agents do not restart, but I get "MonitoringHost.exe Handle Count Threshold Alert Message". How could I measure the actual value used so I could override with something close to it?

Thanks,

Dom

# re: Are your agents restarting every 10 minutes? Are you sure?

Thursday, December 17, 2009 1:14 PM by mood76

Hi Kevin,

I have seen this issue reported in many newsgroups but have not been able to get a satisfactory answer.

Using your example above, I have created a Group in a new unsealed MP, say 'NewMP'

Then I go to the monitor for Private Bytes threshold for the health service (which is stored in the DefaultMP)

I try to override this monitor for a group, and once the group list comes up, I don't see the group I created earlier.

In fact, even if I create a group and store it in the default MP, I am still not able to view it when I try to override that monitor.

Can you shed some light on this?

Thanks,

Mahmood

# re: Are your agents restarting every 10 minutes? Are you sure?

Thursday, December 17, 2009 10:52 PM by kevinhol

@ Mahmood:

What you are describing is a known SP1 bug - when you override a monitor in the authoring pane of the UI - custom groups are not displayed.

Simply go to discovered inventory - target HealthService - and open health explorer.  From here - you can create the overrides for your groups.

NOTE - there is a new update for SP1 that YOU MUST apply - an MP update which sets these monitors to 300MB by default - up from 100MB - which resolves 95% of the restart problems.  Only a handful of machines will need more than that, such as very large 64bit Exchange and SQL servers, possibly DNS and DHCP roles.
