Exchange Server 2013 introduces a new feature called Managed Availability, which is a built-in monitoring system with self-recovery capabilities.

If you’re not familiar with Managed Availability, it’s a good idea to read these posts:

As described in the above posts, Managed Availability performs continuous probing to detect possible problems with Exchange components or their dependencies, and it performs recovery actions to make sure the end user experience is not impacted due to a problem with any of these components.

However, there may be scenarios where the out-of-box settings may not be suitable for your environment. This blog post guides you on how to examine the default settings and modify them to suit your environment.

Managed Availability Components

Let’s start by finding out which health sets are on an Exchange server:

Get-HealthReport -Identity Exch2

This produces the output similar to the following:

image

 

Next, use Get-MonitoringItemIdentity to list out the probes, monitors, and responders related to a health set. For example, the following command lists the probes, monitors, and responders included in the FrontendTransport health set:

Get-MonitoringItemIdentity -Identity FrontendTransport -Server exch1 | ft name,itemtype –AutoSize

This produces output similar to the following:

image

You might notice multiple probes with same name for some components. That’s because Managed Availability creates a probe for each resource. In following example, you can see that OutlookRpcSelfTestProbe is created multiple times (one for each mailbox database present on the server).

image

 

Use Get-MonitoringItemIdentity to list the monitoring Item Identities along with the resource for which they are created:

Get-MonitoringItemIdentity -Identity Outlook.Protocol -Server exch1 | ft name,itemtype,targetresource –AutoSize

image

 

Customize Managed Availability

Managed Availability components (probes, monitors and responders) can be customized by creating an override.

There are two types of override: local override and global override. As their names imply, a local override is available only on the server where it is created, and a global override is used to deploy an override across multiple servers.

Either override can be created for a specific duration or for a specific version of servers.

Local Overrides

Local overrides are managed with the *-ServerMonitoringOverride set of cmdlets. Local overrides are stored under following registry path:

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\ExchangeServer\v15\ActiveMonitoring\Overrides\

The Microsoft Exchange Health Management service reads this registry path every 10 minutes and loads configuration changes. Alternatively, you can restart the service to make the change effective immediately.

You would usually create a local override to:

  • Customize a managed availability component that is server-specific and not available globally; or
  • Customize a managed availability component on a specific server.

Global Overrides

Global overrides are managed with the *-GlobalMonitoringOverride set of cmdlets. Global overrides are stored in the following container in Active Directory:

CN=Overrides,CN=Monitoring Settings,CN=FM,CN=Microsoft Exchange,CN=Services,CN=Configuration,DC=Contoso,DC=com

Get Configuration Details

The configuration details of most of probes, monitors, and responders are stored in the respective crimson channel event log for each monitoring item identity, you would examine these first before deciding to change.

In this example, we will explore properties of a probe named “OnPremisesInboundProxy”, which is part of the FrontendTransport Health Set. The following script lists detail of the OnPremisesInboundProxy probe:

(Get-WinEvent -LogName Microsoft-Exchange-ActiveMonitoring/ProbeDefinition | % {[XML]$_.toXml()}).event.userData.eventXml | ?{$_.Name -like "OnPremisesInboundProxy"}

You can also use Event Viewer to get the details of probe definition. The configuration details most of the Probes are stores in the ProbeDefinition channel:

  1. Open Event Viewer, and then expand Applications and Services Logs\Microsoft\Exchange\ActiveMonitoring\ProbeDefinition.
  2. Click on Find, and then enter OnPremisesInboundProxy.
  3. The General tab does not show much detail, so click on the Details tab, it has the configuration details specific to this probe. Alternatively, you can copy the event details as text and paste it into Notepad or your favorite editor to see the details.

Override Scenarios

Let’s look at couple real-life scenarios and apply our learning so far to customize managed availability to our liking, starting with local overrides.

Creating a Local Override

In this example, an administrator has customized one of the Inbound Receive connectors by removing the binding of loopback IP address. Later, they discover that the FrontEndTransport health set is unhealthy. On further digging, they determine that the OnPremisesInboundProxy probe is failing.

To figure out why the probe is failing, you can first list the configuration details of OnPremisesInboundProxy probe.

(Get-WinEvent -LogName Microsoft-Exchange-ActiveMonitoring/ProbeDefinition | % {[XML]$_.toXml()}).event.userData.eventXml | ?{$_.Name -like "OnPremisesInboundProxy"}

Name : OnPremisesInboundProxy

WorkItemVersion : [null]

ServiceName : FrontendTransport

DeploymentId : 0

ExecutionLocation : [null]

CreatedTime : 2013-08-06T12:54:29.7571195Z

Enabled : 1

TargetPartition : [null]

TargetGroup : [null]

TargetResource : [null]

TargetExtension : [null]

TargetVersion : [null]

RecurrenceIntervalSeconds : 300

TimeoutSeconds : 60

StartTime : 2013-08-06T12:54:36.7571195Z

UpdateTime : 2013-08-06T12:48:27.1418660Z

MaxRetryAttempts : 1

ExtensionAttributes : <ExtensionAttributes><WorkContext><SmtpServer>127.0.0.1</SmtpServer><Port>25</Port><HeloDomain>InboundProxyProbe</HeloDomain><MailFrom Username="inboundproxy@contoso.com"/><MailTo Select="All" Username="HealthMailboxdd618748368a4935b278e884fb41fd8a@FM.com"/><Data AddAttributions="false">X-Exchange-Probe-Drop-Message:FrontEnd-CAT-250 Subject:Inbound proxy probe</Data><ExpectedConnectionLostPoint>None</ExpectedConnectionLostPoint></WorkContext></ExtensionAttributes>

The ExtentionAttributes property above shows that the probe is using 127.0.0.1 for connecting to port 25. As that is the loopback address, the administrator needs to change the SMTP server in ExtentionAttributes property to enable the probe to succeed.

You use following command to create a local override, and change the SMTP server to the hostname instead of loopback IP address.

Add-ServerMonitoringOverride -Server ServerName -Identity FrontEndTransport\OnPremisesInboundProxy -ItemType Probe -PropertyName ExtensionAttributes -PropertyValue '<ExtensionAttributes><WorkContext><SmtpServer>Exch1.contoso.com</SmtpServer><Port>25</Port><HeloDomain>InboundProxyProbe</HeloDomain><MailFrom Username="inboundproxy@contoso.com" /><MailTo Select="All" Username="HealthMailboxdd618748368a4935b278e884fb41fd8a@FM.com" /><Data AddAttributions="false">X-Exchange-Probe-Drop-Message:FrontEnd-CAT-250Subject:Inbound proxy probe</Data><ExpectedConnectionLostPoint>None</ExpectedConnectionLostPoint></WorkContext></ExtensionAttributes>' -Duration 45.00:00:00

The probe will be created on the specified server in following registry path:

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\ExchangeServer\v15\ActiveMonitoring\Overrides\Probe

You can use following command to verify if the probe is effective:

(Get-WinEvent -LogName Microsoft-Exchange-ActiveMonitoring/ProbeDefinition | % {[XML]$_.toXml()}).event.userData.eventXml | ?{$_.Name -like "OnPremisesInboundProxy"}

Creating a Global Override

In this example, the organization has an EWS application that is keeping the EWS app pools busy for complex queries. The administrator discovers that the EWS App pool is recycled during long running queries, and that the EWSProxyTestProbe probe is failing.

To find out the details of EWSProxyTestProbe, run the following:

(Get-WinEvent -LogName Microsoft-Exchange-ActiveMonitoring/ProbeDefinition | % {[XML]$_.toXml()}).event.userData.eventXml | ?{$_.Name -like "EWSProxyTestProbe"}

Next, change the timeout interval for EWSProxyTestProbe to 25 seconds on all servers running Exchange Server 2013 RTM CU2:

  1. Use following command to get version information for Exchange 2013 RTM CU2 servers:

    Get-ExchangeServer | ft name,admindisplayversion

  2. Use following command to create a new global override:

    Add-GlobalMonitoringOverride -Identity "EWS.Proxy\EWSProxyTestProbe" -ItemType Probe -PropertyName TimeoutSeconds -PropertyValue 25 –ApplyVersion “15.0.712.24”

Override Durations

Either of above mentioned overrides can be created for a specific Duration or for a specific Version of Exchange servers. The override created with Duration parameter will be effective only for the period mentioned. And maximum duration that can be specified is 60 days. For example, an override created with Duration 45.00:00:00 will be effective for 45 days since the time of creation.

The Version specific override will be effective as long as the Exchange server version matches the value specified. For example, an override created for Exchange 2013 CU1, with version “15.0.620.29” will be effective until the Exchange server version changes. The override will be ineffective if the Exchange server is upgraded to different version of Cumulative Update or Service Pack.

Hence, if you need an override in effect for longer period, create the override using ApplyVersion parameter.

Removing an Override

Finally, this last example shows how to remove the local override that was created for the OnPremisesInbounxProxy probe.

Remove-ServerMonitoringOverride -Server ServerName -Identity FrontEndTransport\OnPremisesInboundProxy -ItemType Probe -PropertyName ExtensionAttributes

 

Conclusion

Managed Availability performs gradual recovery actions to automatically recover from failure scenarios. The overrides help you customize the configuration of the Managed Availability components to suite to your environment. The steps mentioned in the document can be used to customize Monitors & Probes as required.

Special thanks to Abram Jackson, Scott Schnoll, Ben Winzenz, and Nino Bilic for reviewing this post. Smile

Bhalchandra Atre