Kevin Holman's System Center Blog

Posts in this blog are provided "AS IS" with no warranties, and confer no rights. Use of included script samples is subject to the terms specified in the Terms of Use.

Clustering the Exchange 2010 Correlation Engine service

The Exchange 2010 MP installs a unique service which handles monitor state correlation.  It is designed to reduce the number of alerts coming from this MP and to enhance root cause analysis of the core issue.

The Exchange MP guide states the following:

Determine which server will host the Correlation Engine. While not strictly required, it is highly recommended that the Correlation Engine service is installed on the Operations Manager Root Management Server (RMS).

I think the primary reason for this is that this service, much like a product connector, connects to the RMS SDK to do its work.  Being on the RMS machine makes these SDK connections very efficient.  As the guide states – it is not required, and the service could be installed on another server (like a management server) with VERY low network latency to the RMS.

However – if you have a clustered RMS, you likely chose that path for the absolute minimal downtime for scheduled maintenance.  Since getting Exchange 2010 alerts depends on the Correlation Engine being available, it makes sense to give consideration to adding this service to your clustered RMS.

 

This article is designed to cover the scenario where your RMS is clustered, and you consider the Correlation Engine for Exchange 2010 a critical service and desire to cluster it as well.  This is not official documentation on the process, just a method I use that seems to work well.

 

Follow the MP Guide, and when you reach the section on installing the Correlation Engine, use these notes.

 

1.  Install the Correlation Engine:  Log on to Node1 of the RMS cluster.  Run ONE of the two files, depending on whether your RMS cluster is x86 or x64:

  • Exchange2010ManagementPackForOpsMgr2007-EN-i386.msi
  • Exchange2010ManagementPackForOpsMgr2007-EN-x64.msi

***Note – the MSI to install the Correlation Engine requires .NET 3.5. Please ensure this is installed (or a later version) prior to beginning your installation of the Correlation Engine (CE).  .NET installations can take a considerable amount of time.

This MSI will do two things – it will INSTALL the correlation service, and will extract the Exchange 2010 management pack.  Choose a LOCAL disk for each, as appropriate.  There is no requirement to place the correlation engine on a shared cluster disk.  Install the MSI similar to below:

image

 

2.  Configure the service:  As soon as the install completes – we need to stop the Microsoft Exchange Monitoring Correlation service, and change its startup type from Automatic to Manual.

image
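If you would rather script this step than click through Services.msc, here is a minimal sketch.  It only builds (and by default just prints) the two sc.exe command lines.  The service name is the one used later in this article for the cluster resource.  The helper names and the dry_run switch are my own for illustration, not part of the MP.

```python
import subprocess

# Short service name, as used for the Generic Service resource later on.
SERVICE_NAME = "MSExchangeMonitoringCorrelation"


def service_commands(service=SERVICE_NAME):
    """Return the sc.exe commands that stop the service and set it to Manual."""
    return [
        ["sc", "stop", service],
        # sc.exe wants "start=" and its value as separate tokens;
        # "demand" is the sc.exe keyword for the Manual startup type.
        ["sc", "config", service, "start=", "demand"],
    ]


def run(commands, dry_run=True):
    """Print each command; only execute it when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=False: "sc stop" returns non-zero if already stopped.
            subprocess.run(cmd, check=False)


run(service_commands())  # dry run: prints the commands, changes nothing
```

Run it with dry_run=False from an elevated prompt on each node only after you are happy with what it prints.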

 

3.  Edit the config file:  Next – we need to edit the service configuration file to point to the correct RMS name.  This needs to be done anytime the correlation engine is installed on a cluster, or if it is installed on a NON-RMS server.  Browse to the path where you installed the service, such as C:\Program Files\Microsoft\Exchange Server\V14\Bin\ and open the Microsoft.Exchange.Monitoring.CorrelationEngine.exe.config file in Notepad. 

Find the line with <add key="OpsMgrRootManagementServer" value="localhost" /> and change “localhost” to the virtual name of your clustered RMS, as in the following example:

image

Save your changes to this file.
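Since this edit has to be made on every node, a small script can help keep the nodes consistent.  The sketch below works on the text of the config file and rewrites the value of the OpsMgrRootManagementServer key.  The surrounding configuration/appSettings layout in the sample and the name RMSVIRTUAL are assumptions for illustration; only the add line itself comes from the real file.  Note that ElementTree does not preserve comments, so keep a backup copy of the original file.

```python
import xml.etree.ElementTree as ET

KEY = "OpsMgrRootManagementServer"


def set_rms_name(config_xml: str, rms_name: str) -> str:
    """Return config_xml with the RMS key pointing at rms_name."""
    root = ET.fromstring(config_xml)
    matches = [el for el in root.iter("add") if el.get("key") == KEY]
    if not matches:
        raise KeyError(f"{KEY} key not found in config")
    for el in matches:
        el.set("value", rms_name)
    return ET.tostring(root, encoding="unicode")


# Assumed layout for illustration; the real file has more settings.
SAMPLE = """<configuration>
  <appSettings>
    <add key="OpsMgrRootManagementServer" value="localhost" />
  </appSettings>
</configuration>"""

# "RMSVIRTUAL" is a hypothetical virtual RMS name.
print(set_rms_name(SAMPLE, "RMSVIRTUAL"))
```

The function raises a KeyError rather than silently doing nothing, so a wrong file or a changed key name shows up right away.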

 

4.  Repeat the above steps on the second node of the cluster:  Now, with the service stopped and set to Manual on Node1, we need to repeat steps 1-3 on Node2.  Install the correct MSI for the OS architecture (x86 or x64) to the local disk on Node2, stop the service and set it to Manual startup, and edit the config file on Node2 to use the correct RMS virtual name.

 

5.  Configure the clustered resource:  The next steps differ slightly, depending on whether your RMS cluster is based on Windows 2003 or Windows 2008.  I will demonstrate both below:

For Windows 2003 Clusters:

Open Cluster Administrator, select your RMS cluster resource group.  In this group – you should already have resources for the:

  • Cluster Disk for the RMS
  • IP Address
  • Network Name
  • Config Service
  • Health Service
  • SDK Service

image

Right click your RMS cluster resource group, and choose New, Resource.

Give your new resource a name, such as “Exchange 2010 Correlation Engine”

In the Resource Type – choose Generic Service.

Make sure your group is the RMS Virtual cluster resource group, and hit Next.

image

 

Under possible owners, ensure both nodes are possible owners, and hit Next.

For dependencies – we need to take a dependency on the existing SDK service, because the correlation engine cannot run if the SDK service is not running.  Configure the dependency:

image

 

For the service name – type in EXACTLY:  MSExchangeMonitoringCorrelation

image

 

Complete the configuration of the generic service, accepting defaults on the remaining screens.

Bring the new service online on Node 1.

Open Services.msc and examine the Microsoft Exchange Monitoring Correlation service.  It should be in a Started status.

Test a failover of the RMS cluster to the second node.  You should see the Microsoft Exchange Monitoring Correlation service stop on Node 1, and start on Node 2.
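For those who prefer the command line over Cluster Administrator, the same resource can in principle be created with cluster.exe.  The sketch below only prints the command lines so you can review them first.  The group name and the SDK resource name are hypothetical placeholders, and the switches are written from my recollection of cluster.exe syntax, so check them against "cluster res /?" on your own cluster before running anything.

```python
def cluster_commands(resource, group, sdk_resource,
                     service="MSExchangeMonitoringCorrelation"):
    """Build cluster.exe command lines for a Generic Service resource."""
    res = f'cluster res "{resource}"'
    return [
        # Create the resource in the RMS group as a Generic Service.
        f'{res} /create /group:"{group}" /type:"Generic Service"',
        # Point the resource at the Correlation Engine service.
        f'{res} /priv ServiceName={service}',
        # Depend on the SDK service resource, as in the GUI steps.
        f'{res} /adddep:"{sdk_resource}"',
        # Bring the new resource online.
        f'{res} /online',
    ]


for line in cluster_commands(
        resource="Exchange 2010 Correlation Engine",
        group="RMS Cluster Group",           # hypothetical group name
        sdk_resource="OpsMgr SDK Service"):  # hypothetical resource name
    print(line)
```

Substitute the real names of your RMS group and SDK service resource as they appear in Cluster Administrator.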

 

For Windows 2008 Clusters:

Similar steps to above.  Open Failover Cluster Manager.  Right click your Virtual RMS cluster resource group under Services and Applications.  Add a resource > Generic Service.

image

 

In 2008, simply choose the Correlation Engine service from the list, and accept defaults on remaining screens.

Open the properties of your new service resource, and set the dependency on the existing SDK service.

Test the failover similar to the Windows 2003 instructions.

 

 

6.  Import Management Pack:  At this point, if you have not already done so – you should import the Microsoft Exchange 2010 management packs.  The Correlation Engine (CE) will complain in the application log every 30 seconds if it connects to an SDK for a Management Group that does not have the MP imported.

 

 

Complete - you have successfully configured the Correlation Engine into the cluster, in 6 easy steps.

 

 

 

Troubleshooting:

The Correlation Engine has some pretty good logging, to the application log.  Below are some typical events that the service can raise, with a brief description of each:

 

Event Type:    Information
Event Source:    MSExchangeMonitoringCorrelation
Event Category:    General
Event ID:    700
Date:        10/15/2010
Time:        8:28:13 AM
User:        N/A
Computer:    RMSCLN1
Description:
Service starting.

Event Type:    Information
Event Source:    MSExchangeMonitoringCorrelation
Event Category:    General
Event ID:    703
Date:        10/15/2010
Time:        8:30:49 AM
User:        N/A
Computer:    RMSCLN1
Description:
Service stopped successfully.

The two above are pretty self-explanatory.

Event Type:    Error
Event Source:    MSExchangeMonitoringCorrelation
Event Category:    General
Event ID:    714
Date:        10/15/2010
Time:        8:28:18 AM
User:        N/A
Computer:    RMSCLN1
Description:
Cannot connect to Operations Manager Root Management Server.

Error message: The Management Pack with the ID 'Microsoft.Exchange.2010' was not found on Operations Manager Root Management Server 'localhost'.

The above means that the Correlation service was not able to connect to the RMS SDK service, or the Exchange 2010 MP has not been imported yet.

Event Type:    Warning
Event Source:    MSExchangeMonitoringCorrelation
Event Category:    General
Event ID:    711
Date:        10/15/2010
Time:        8:28:18 AM
User:        N/A
Computer:    RMSCLN1
Description:
Unable to connect to Operations Manager Root Management Server 'localhost' using Management Pack with ID 'Microsoft.Exchange.2010' during service startup.  The service will continue to start and retry the connection periodically.

The above means the SDK service is not running, the RMS is not contactable, the Correlation service was not able to connect to the RMS SDK service, or the Exchange 2010 MP has not been imported yet.

 

Event Type:    Warning
Event Source:    MSExchangeMonitoringCorrelation
Event Category:    General
Event ID:    717
Date:        10/15/2010
Time:        8:28:19 AM
User:        N/A
Computer:    RMSCLN1
Description:
Connection with the Operations Manager Root Management Server has failed.

Error message: The Exchange Monitoring Correlation service cannot connect to the Operations Manager Root Management Server SDK Service. Exchange alerts will not be raised.

Number of occurrence: 1

Retrying in 30 seconds...

The above means the SDK service is not running, the RMS is not contactable, the Correlation service was not able to connect to the RMS SDK service, or the Exchange 2010 MP has not been imported yet.

 

Event Type:    Error
Event Source:    MSExchangeMonitoringCorrelation
Event Category:    General
Event ID:    714
Date:        10/15/2010
Time:        8:40:50 AM
User:        N/A
Computer:    RMSCLN1
Description:
Cannot connect to Operations Manager Root Management Server.

Error message: Couldn't connect to the Operations Manager Root Management Server 'blah'. Check the server name and network connection.

For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.

The above means the Correlation Engine could not resolve or contact the RMS computer.  Check that the RMS is resolvable, pingable, and that the correlation config file has the correct RMS name.

Event Type:    Warning
Event Source:    MSExchangeMonitoringCorrelation
Event Category:    General
Event ID:    717
Date:        10/15/2010
Time:        9:27:06 AM
User:        N/A
Computer:    RMSCLN1
Description:
Connection with the Operations Manager Root Management Server has failed.

Error message: A Management Pack in the Management Group has been added, upgraded, overridden, or deleted.  A reconnection will be needed to detect the changes.

Number of occurrence: 1

Retrying in 30 seconds...

You will see the above event once a running Correlation engine initially detects that the Exchange 2010 MP has been imported, after issuing errors that the MP was not found.

 

Log Name:      Application
Source:        MSExchangeMonitoringCorrelation
Date:          10/15/2010 9:41:41 AM
Event ID:      714
Task Category: General
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      OMRMS.opsmgr.net
Description:
Cannot connect to Operations Manager Root Management Server.

Error message: The OpsMgr SDK Service is not running on Operations Manager Root Management Server 'localhost'.

The above event is fired whenever the Correlation Engine can connect to the RMS computer, but the SDK service is not running, or not fully started.
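To pull the events above together, here is a rough triage helper that maps an event ID plus its description text to the likely cause described in this article.  It is just a lookup sketch based on the sample events shown here, not an exhaustive list of what the service can log, and the function name is my own.

```python
def diagnose(event_id: int, description: str = "") -> str:
    """Map a MSExchangeMonitoringCorrelation event to a likely cause."""
    text = description.lower()
    general = ("SDK service not running, RMS not contactable, "
               "or Exchange 2010 MP not imported yet.")
    if event_id == 700:
        return "Service starting."
    if event_id == 703:
        return "Service stopped successfully."
    if event_id == 714:
        if "was not found" in text:
            return "Exchange 2010 MP not imported yet, or SDK connection failed."
        if "sdk service is not running" in text:
            return "RMS reachable, but SDK service not running or not fully started."
        if "couldn't connect" in text:
            return "RMS not resolvable or reachable; check name, ping, config file."
        return general
    if event_id == 717 and "has been added" in text:
        return "MP change detected (for example the MP was just imported); reconnecting."
    if event_id in (711, 717):
        return general
    return "Event not covered by this article."


print(diagnose(714, "The OpsMgr SDK Service is not running on "
                    "Operations Manager Root Management Server 'localhost'."))
```

Event 714 is the one to read carefully, since the same ID covers three different problems and only the error message tells them apart.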

Comments
  • Great article Kevin!  This is fantastic information.

  • Kevin - If we don't have clustered RMS, is it still recommended to host the correlation engine on RMS or on a MS?

  • @Ramesh -

    It doesn't make a lot of difference.  Whichever box is beefier, I would say.  Generally this will be the RMS.  We prefer to run it somewhere that has VERY good connectivity to the SDK service, which will be running on the RMS.  Therefore - the RMS would be my first choice in most environments.  If you have a MS positioned for standby use, with low utilization, physical hardware, and the same datacenter/LAN segment as the RMS, then I might consider that server over the RMS.  But in most cases the RMS is a good candidate for this service.

  • Great Help! It doesn't miss a beat in helping me install the correlation engine.

  • Kevin,

    Is it possible to move the Alert Correlation Engine to another server? I have it installed on my Server 2003 RMS. I have a new Server 2008 MS and I want this one to become the RMS.

  • Thanks this helped me configure the service on a non RMS server. I didn't realize there was a config file.  Works great now ;)
