Mihai Sarbulescu's System Center Blog

let's play with System Center

Configuration Manager Connector's DCM Rule can cause massive performance issues in Service Manager


IMPORTANT: Always perform a FULL backup of the database before making any changes to it!

ALSO: It is advised that you open a case with Microsoft before doing this - directly editing the database is not supported, and you may find yourself in an unsupported state if anything goes wrong.

 

The Configuration Manager Connector Management Pack contains a Rule called Incident_Desired_Configuration_Management_Custom_Rule.Update. This Rule can cause workflows (Subscription Rules) to lag far behind and cause the grooming jobs to fail, which makes the EntityChangeLog table grow very large. In turn, this causes an internal SQL Stored Procedure called p_EntityChangeLogSnapshot to take a long time to finish. This stored procedure is executed very often, and while it is running, the performance of the Console is also heavily impacted.

This situation can occur if the Configuration Manager Connector syncs a *lot* of updates to the Microsoft.SystemCenter.ConfigurationManager.DCM_NonCompliance_CI instances. The Incident_Desired_Configuration_Management_Custom_Rule.Update Rule is configured to process these updates, but it runs only every 100 seconds and processes *only* 30 update notifications per run (batch size is set to 30). So if there is a very large number of updates to process, the Rule keeps lagging behind because it takes too long to catch up. This causes all other Rules to get stalled - but keep in mind that this Rule is still processing, just very slowly. As long as this Rule cannot finish processing the updates and keeps lagging behind, the grooming jobs (SQL Stored Procedure p_GroomSubscriptionSpecificECLRows) will not be able to clear the backlog (the update entries) from the EntityChangeLog table.
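To put those numbers in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes the Rule processes at most one full batch of 30 notifications per 100-second run (the figures from the paragraph above) and nothing more. The function names, the sample backlog and the sample incoming rates are made up for illustration and do not come from any real environment.

```python
import math

BATCH_SIZE = 30          # notifications processed per run (from the text above)
INTERVAL_SECONDS = 100   # seconds between runs (from the text above)


def max_processed_per_day(batch_size=BATCH_SIZE, interval_seconds=INTERVAL_SECONDS):
    """Upper bound on notifications the Rule can process in 24 hours."""
    runs_per_day = 86400 // interval_seconds
    return runs_per_day * batch_size


def days_to_catch_up(backlog, incoming_per_day):
    """Days needed to clear the backlog, or None if the Rule can never catch up."""
    spare = max_processed_per_day() - incoming_per_day
    if spare <= 0:
        return None
    return math.ceil(backlog / spare)


print(max_processed_per_day())            # 25920
print(days_to_catch_up(500_000, 10_000))  # hypothetical backlog and daily rate
print(days_to_catch_up(500_000, 30_000))  # hypothetical: more incoming than capacity
```

Under this assumption the Rule tops out at 25,920 notifications per day, so a connector that syncs more updates than that per day would leave the Rule falling further behind indefinitely.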

 

To generally check which rules are lagging behind and to particularly check if you are impacted by this issue, run this SQL Query on your ServiceManager database:

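           NOTE: if you are using SM 2010, please remove both IsPeriodicQueryEvent conditions from the following query (W.IsPeriodicQueryEvent = 0 and S.IsPeriodicQueryEvent = 0), together with the AND at the end of the line before each of them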

 DECLARE @MinState INT
SET @MinState = (
   SELECT MIN(State)
   FROM CmdbInstanceSubscriptionState AS W WITH(NOLOCK)
   JOIN Rules AS R
      ON W.RuleId = R.RuleId
   WHERE
      R.RuleEnabled <> 0 AND
      W.IsPeriodicQueryEvent = 0
)
SELECT
   R.RuleName,
   SM.TypeName AS SrcTypeName,
   RT.RelationshipTypeName AS RelTypeName,
   TM.TypeName AS TarTypeName,
   S.LastModified,
   O.OverrideName,
   R.RuleEnabled
FROM CmdbInstanceSubscriptionState AS S WITH(NOLOCK)
LEFT OUTER JOIN Rules AS R
   ON S.RuleId = R.RuleId
LEFT OUTER JOIN ManagedType AS SM
   ON S.TypeId = SM.ManagedTypeId
LEFT OUTER JOIN RelationshipType AS RT
   ON S.RelationshipTypeId = RT.RelationshipTypeId
LEFT OUTER JOIN ManagedType AS TM
   ON S.RelatedTypeId = TM.ManagedTypeId
LEFT OUTER JOIN ModuleOverride AS O
   ON O.ParentId = R.RuleId
WHERE
   S.State <= @MinState AND
   R.RuleEnabled <> 0 AND
   S.IsPeriodicQueryEvent = 0

 

Now if you see Incident_Desired_Configuration_Management_Custom_Rule.Update in the results, then you are probably impacted by this issue. To verify further, let's check if we have a high number of updates for the Microsoft.SystemCenter.ConfigurationManager.DCM_NonCompliance_CI instances. To do this, run this SQL Query on the ServiceManager database:

 SELECT COUNT(*) AS 'Number of Updates to DCM Instances'
FROM EntityChangeLog AS ECL WITH(NOLOCK)
JOIN ManagedType AS MT
   ON ECL.EntityTypeId = MT.ManagedTypeId
WHERE MT.TypeName = 'Microsoft.SystemCenter.ConfigurationManager.DCM_NonCompliance_CI'

 

Ok, so we have a very high number of updates to these instances? ... Well, maybe these do get processed fast enough ... let's also check how many of these *still* need to be processed. To check this, run this SQL Query on the ServiceManager database:

           NOTE: if you are using SM 2010, please remove the condition W.IsPeriodicQueryEvent = 0 from the following query, together with the AND at the end of the line before it

 DECLARE @MinState INT
SET @MinState = (
   SELECT MIN(State)
   FROM CmdbInstanceSubscriptionState AS W WITH(NOLOCK)
   JOIN Rules AS R
      ON W.RuleId = R.RuleId
   WHERE
      R.RuleEnabled <> 0 AND
      W.IsPeriodicQueryEvent = 0
)
SELECT COUNT(*) AS 'UNPROCESSED Number of Updates to DCM Instances'
FROM EntityChangeLog AS ECL WITH(NOLOCK)
JOIN ManagedType AS MT
   ON ECL.EntityTypeId = MT.ManagedTypeId
WHERE
   ECL.EntityTransactionLogId >= @MinState AND
   MT.TypeName = 'Microsoft.SystemCenter.ConfigurationManager.DCM_NonCompliance_CI'

 

Right, so this is getting interesting: this is actually a *lot* of data (updates) coming in from the connector. So, just out of curiosity, let us see how many updates to the DCM instances we actually get per day. To do this, run this SQL Query on the ServiceManager database:

           NOTE: if you are using SM 2010, please remove the condition W.IsPeriodicQueryEvent = 0 from the following query, together with the AND at the end of the line before it

 DECLARE @MinState INT
SET @MinState = (
   SELECT MIN(State)
   FROM CmdbInstanceSubscriptionState AS W WITH(NOLOCK)
   JOIN Rules AS R
      ON W.RuleId = R.RuleId
   WHERE
      R.RuleEnabled <> 0 AND
      W.IsPeriodicQueryEvent = 0
)
SELECT
   DATEPART(DAYOFYEAR, ECL.LastModified) AS 'Day of the Year',
   CASE ECL.ChangeType
      WHEN 0 THEN 'Add'
      WHEN 1 THEN 'Update'
      ELSE 'Other'
   END AS ChangeType,
   COUNT(*) AS InstanceCount
FROM EntityChangeLog AS ECL WITH(NOLOCK)
JOIN ManagedType AS MT
   ON ECL.EntityTypeId = MT.ManagedTypeId
WHERE
   ECL.EntityTransactionLogId >= @MinState AND
   MT.TypeName = 'Microsoft.SystemCenter.ConfigurationManager.DCM_NonCompliance_CI'
GROUP BY DATEPART(DAYOFYEAR, ECL.LastModified), ECL.ChangeType
ORDER BY DATEPART(DAYOFYEAR, ECL.LastModified), ECL.ChangeType

 

Well then, it seems that we have a *lot* of DCM instance updates per day ... hmm, interesting - so how do we solve this issue? Well, we need to disable the Rule by creating an override, clear the Rule entry from the CmdbInstanceSubscriptionState table and then run grooming manually a couple of times until we see that it finishes really quickly.

Keep in mind that when doing this, it *may* take a very long time (even a couple of hours) until grooming completes its clean-up. This all depends on how many entries it actually needs to groom. Ideally, run the grooming overnight or over the weekend and not during working hours, because it will impact Console performance a lot. The *best* case is to have no Console open until grooming finishes.

 

So, to solve this, follow the steps described below:

1. import the attached Management Pack (DisableDCMRuleMPFix.xml) into Service Manager - it contains only an override which disables the DCM Rule (if you are on SM 2010, you may need to modify it for it to work - or create a similar one for 2010)

 

2. run this SQL Query on the ServiceManager database to delete the DCM Rule entry from the CmdbInstanceSubscriptionState table:

 DELETE CmdbInstanceSubscriptionState
WHERE RuleId = (
   SELECT RuleId
   FROM Rules
   WHERE RuleName = 'Incident_Desired_Configuration_Management_Custom_Rule.Update'
)

 

3. at this point, *NO* Console should be open, so please close all Consoles, then stop all the System Center Service Manager services running on all the SM Servers and on the SM Data Warehouse Server. Perform the next step only when you know that Service Manager is not being used by clients - so non-business hours would be best here

 

4. run this SQL Query (the grooming stored procedure) on the ServiceManager database a couple of times until you see that it finishes quickly:

EXEC p_GroomSubscriptionSpecificECLRows '55270A70-AC47-C853-C617-236B0CFF9B4C', 0, N'', 10000

NOTE: the last parameter for this stored procedure is the batch size - how many rows will be deleted in one iteration. You can increase or decrease it according to the number of entries you saw in the queries above. The higher the batch size, the faster this will finish - but don't set it too high
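As a rough way to reason about the batch size, the small Python sketch below estimates how many delete iterations a given backlog implies. It only models the statement above that each iteration deletes up to one batch of rows. How many iterations a single EXEC of the stored procedure actually performs is not covered here, and the function name and the 12 million row backlog are hypothetical.

```python
import math


def grooming_iterations(rows_to_groom, batch_size=10000):
    """Delete iterations needed if each iteration removes up to batch_size rows."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return math.ceil(rows_to_groom / batch_size)


# Hypothetical backlog of 12 million EntityChangeLog rows
for size in (1000, 10000, 50000):
    print(size, grooming_iterations(12_000_000, size))
```

Plug in the count returned by the queries above instead of the hypothetical backlog to get a feel for how the batch size trades off against the number of iterations.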

 

5. once you are done, don't forget to start all the System Center Service Manager services again on all the SM Servers and on the SM Data Warehouse Server

 

 

Well then, now things should get going again in Service Manager! :D

By the way - these actions/queries can also be used for other rules that end up in this bad situation. To check whether you are impacted by something like this, go through this article: http://blogs.technet.com/b/mihai/archive/2012/07/13/service-manager-slow-perfomance.aspx

 

 

Happy hunting!

 

Attachment: DisableDCMRuleMPFix.zip
  • Hello Mihai,

    thanks for sharing!

    UNPROCESSED Number of Updates to DCM Instances is in a large customer environment "0" so thats good :-)
