Was working with a customer on this one – figured it might help others.
We saw a lot of very specific 29106 events on the RMS, all containing this text:
System.ArgumentOutOfRangeException: Index was out of range. Must be non-negative and less than the size of the collection.
Here is the full event:
Event Type: Warning
Event Source: OpsMgr Config Service
Event Category: None
Event ID: 29106
Date: 11/10/2009
Time: 12:43:24 PM
User: N/A
Computer: AGENTNAME
Description:
The request to synchronize state for OpsMgr Health Service identified by "3688d65d-a16c-2be6-7e84-5faf8a9cffe0" failed due to the following exception "System.ArgumentOutOfRangeException: Index was out of range. Must be non-negative and less than the size of the collection.
Parameter name: index
What we found was that we could look up these health service IDs by pasting them into the following SQL query, run against the OperationsManager database:
select * from MTV_HealthService
where BaseManagedEntityId = '3688d65d-a16c-2be6-7e84-5faf8a9cffe0'
This would give us the name of the agent.
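If you have a pile of these events, pasting IDs in one at a time gets old. Here is a minimal Python sketch that pulls every GUID out of pasted event text and builds a single query against the same view. The helper names (extract_health_service_ids, build_lookup_query) are made up for illustration and are not part of OpsMgr. It assumes every GUID in the pasted text is a health service ID, which held for the events shown here but may not hold for other event text.

```python
import re

# Matches GUIDs in the usual 8-4-4-4-12 hex format.
GUID_RE = re.compile(r"[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}")


def extract_health_service_ids(event_text):
    """Return the unique GUIDs found in event_text, lowercased, in first-seen order."""
    seen = []
    for guid in GUID_RE.findall(event_text):
        guid = guid.lower()
        if guid not in seen:
            seen.append(guid)
    return seen


def build_lookup_query(ids):
    """Build one query covering every ID, using the same view as the query above."""
    if not ids:
        raise ValueError("no health service IDs supplied")
    id_list = ",\n  ".join("'{}'".format(i) for i in ids)
    return (
        "select * from MTV_HealthService\n"
        "where BaseManagedEntityId in (\n  " + id_list + "\n)"
    )
```

Paste the descriptions of the 29106 events into a string, pass it to extract_health_service_ids, and hand the result to build_lookup_query. The output is plain SQL text to run yourself; the sketch never connects to the database.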
In the console, under Agent Managed, we found that all of these agents were in an “Unmonitored” state. On the agents themselves, they were stuck: they looked like they had been installed, but could not get config. We deleted them from Agent Managed, waited a few minutes, and let them show back up in Pending Management. Once we approved them, they were able to come back in and work properly. For the most part these looked like orphaned machines, and several were computers that had been renamed or old DCs that had been demoted.