Kevin Holman's System Center Blog

Posts in this blog are provided "AS IS" with no warranties, and confer no rights. Use of included script samples is subject to the terms specified in the Terms of Use.

A new feature in R2-CU4 – reconnecting to SQL server after a SQL outage

There is a new feature added to the R2 Cumulative Update 4 hotfix, which I recently wrote about HERE

This new feature enables the OpsMgr Management Server to better tolerate, in specific cases, an outage of the SQL server hosting the OperationsManager database.  This feature is NOT enabled by default, by design.  In order to enable this feature you MUST have previously applied R2-CU4 or later to your RMS role.  You should only enable this feature if you feel you have been impacted by this issue, and you find you have to restart your RMS services frequently to get things flowing again after a SQL connectivity outage.

 

In typical situations, the Root Management Server reconnects to SQL pretty well if the SQL server is unavailable for a short time.  This might happen when your SQL cluster fails over (there is a short period where the SQL instance is unavailable during a failover) or when patching/rebooting a stand-alone (non-clustered) SQL server.

However, in larger environments, or when the SQL outage extends beyond a short reboot/failover, we have seen cases where the RMS does not reconnect/recover successfully.  Subsequently, the RMS might start logging errors in the event log from the Health Service, including 2115 (bind) events and 4506 (data dropped) events.  Previously, this situation did not recover until the RMS OpsMgr services were restarted (and in some cases the HealthService on the Management Server).

 

To enable this feature, create two new registry entries on the RMS:

Under the “HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft Operations Manager\3.0\DAL” key, create two new DWORD values, as below:

DALInitiateClearPool

DALInitiateClearPoolSeconds

DALInitiateClearPool should be set to Decimal value “1” to enable it.

DALInitiateClearPoolSeconds should be set to Decimal value “60” to represent a 60-second retry interval.
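If you would rather script this than click through regedit, here is a minimal Python sketch of the same change. The key path, value names, and values come straight from the steps above; the function names (render_reg_file, apply_to_registry) are my own, for illustration only. Note that a .reg file stores DWORDs in hex, so decimal 60 shows up as 0000003c. Treat this as a sketch to adapt and test, not a supported tool.

```python
# Hypothetical helper: builds a .reg file for the two DAL values described above.
DAL_KEY = r"SOFTWARE\Microsoft\Microsoft Operations Manager\3.0\DAL"

# Decimal values, exactly as given in the post.
DAL_VALUES = {
    "DALInitiateClearPool": 1,          # 1 = enabled
    "DALInitiateClearPoolSeconds": 60,  # retry interval in seconds
}


def render_reg_file(values=DAL_VALUES):
    """Return .reg file text that creates the DWORD values under the DAL key."""
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        "[HKEY_LOCAL_MACHINE\\" + DAL_KEY + "]",
    ]
    for name, number in values.items():
        # .reg files write DWORDs as 8 hex digits (60 decimal -> 0000003c).
        lines.append('"{}"=dword:{:08x}'.format(name, number))
    return "\r\n".join(lines) + "\r\n"


def apply_to_registry(values=DAL_VALUES):
    """Write the values directly. Windows only; needs administrative rights."""
    import winreg  # imported here so the rest of the script loads anywhere

    with winreg.CreateKeyEx(
        winreg.HKEY_LOCAL_MACHINE, DAL_KEY, 0, winreg.KEY_SET_VALUE
    ) as key:
        for name, number in values.items():
            winreg.SetValueEx(key, name, 0, winreg.REG_DWORD, number)


if __name__ == "__main__":
    # Print the .reg text so it can be reviewed before importing on the RMS.
    print(render_reg_file())
```

Reviewing the printed .reg text first, and importing it by hand on the RMS, is the more cautious route; apply_to_registry is there only if you want to write the values directly.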

 

Here is a screenshot:

[image: screenshot of the registry entries]

 

This change will take effect after you restart the RMS HealthService (System Center Management Service).
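For completeness, a small sketch of scripting that restart. It assumes the service's short name is "HealthService" (the name used above) and simply shells out to the standard Windows net stop / net start commands; the function names are hypothetical. Run it from an elevated prompt on the RMS.

```python
import subprocess

# Assumption: the short service name is "HealthService", per the post's wording.
SERVICE_NAME = "HealthService"


def restart_commands(service=SERVICE_NAME):
    """Return the stop and start commands, in the order they should run."""
    return [["net", "stop", service], ["net", "start", service]]


def restart_service(service=SERVICE_NAME, run=subprocess.run):
    """Stop, then start, the service; raises if either command fails."""
    for command in restart_commands(service):
        run(command, check=True)


if __name__ == "__main__":
    restart_service()
```

The run parameter exists only so the sequence can be dry-run or tested without touching a real service.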

Comments
  • Thanks!

  • Thanks Kevin, after reading the KB article it wasn't clear what type of registry values needed to be created.  Especially as the KB describes one of the values as 'DALInitiateClearPool = true', which I thought was a bit ambiguous.  You have cleared it up though :)

  • Thanks for the info. As pointed out already, the KB article doesn't tell you that you have to use DWORD entries and that true is actually 1.

  • thanks for the great post

  • Does anyone know if this also works in SCOM 2012?

  • Is this still needed in 2012 R2 (SCSM & SCOM)?

  • @Maekee

    This configuration is specific to SCOM 2007 ONLY. These registry settings do not apply to SCOM 2012 and will have no effect. Apparently SCOM 2012 was designed to automatically retry connections with SQL after an outage.

  • Hi Kevin,
    Not according to this KB (http://support.microsoft.com/kb/2913046/en-us): under "Applies to" you can find Microsoft System Center 2012 Operations Manager Service Pack 1.

  • Ok, I got the final low-down on this.

    We moved the DAL registry location. The settings in this BLOG POST apply to SCOM 2007 ONLY. The settings in that above referenced KB ARTICLE are correct, still valid, and apply to SCSM and SCOM 2012, 2012SP1, and 2012R2. I will have a quick blog post on this.
