Shadow Redundancy Scenarios

Shadow redundancy minimizes message loss due to server outages. When a transport server comes back online after an outage, there are two scenarios:

  • The server comes back online with a new transport database – The original transport database is unrecoverable due to data corruption or hardware failure. Because the transport server will have a new database ID, the other transport servers in the organization recognize it as a new route. This also applies when a server can't be recovered and a new server is provisioned as a replacement.
  • The server comes back online with the same transport database – The transport server didn't fail, but was offline for an extended period of time, for example because of a network card failure or lengthy maintenance on the server.

The following summary shows how transport reacts to these two scenarios when shadow redundancy is enabled, depending on whether alternative routes are available. For clarity, assume that the server that had an outage is named Hub01.

Recovery scenario: Hub01 comes back online with a new database.

  • Alternative routes – When Hub01 becomes unavailable, each server that has shadow messages queued for Hub01 will assume ownership of those messages and resubmit them. The messages then get delivered to their destinations using alternative routes. The total delay for messages is equal to the product of the heartbeat time-out interval and the heartbeat retry count configured in your organization.
  • No alternative routes – These messages remain in the shadow queue on each server that has shadow messages queued for Hub01. When Hub01 comes back online with a new database ID, the shadow servers detect that it's a new database and resubmit the messages that are in the shadow queue to Hub01. This is equivalent to suddenly discovering an alternative route for these messages. The total delay for the messages depends on the duration of the outage.

Recovery scenario: Hub01 comes back online with the same database.

  • Alternative routes – Hub01 will deliver the messages in its queues. Because the shadow servers have already resubmitted these messages over alternative routes, this results in duplicate delivery. Exchange mailbox users won't see duplicate messages due to duplicate message detection. However, recipients on foreign systems may receive duplicate copies. The total delay for messages is equal to the product of the heartbeat time-out interval and the heartbeat retry count configured in your organization.
  • No alternative routes – Hub01 will deliver the messages in its queues and then send discard notifications to the shadow servers. The total delay for the messages depends on the duration of the outage.
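The delay quoted for the cases with alternative routes is simple arithmetic, and duplicates arise in only one combination of the two conditions. The following is a minimal sketch of both points, not Exchange code: the function names are hypothetical, and the 15-minute interval and retry count of 12 are illustrative values only, so substitute the settings configured in your own organization.

```python
from datetime import timedelta


def resubmit_delay(timeout_interval: timedelta, retry_count: int) -> timedelta:
    """Delay before shadow servers resubmit: time-out interval x retry count."""
    if retry_count < 0:
        raise ValueError("retry_count must not be negative")
    return timeout_interval * retry_count


def duplicates_possible(same_database: bool, alternative_routes: bool) -> bool:
    """Duplicates can occur only when the shadow servers have already resubmitted
    over alternative routes AND the server returns with its original queues."""
    return same_database and alternative_routes


# Illustrative values (assumptions, not read from any real configuration).
delay = resubmit_delay(timedelta(minutes=15), 12)
print(delay)  # 3:00:00

print(duplicates_possible(same_database=True, alternative_routes=True))   # True
print(duplicates_possible(same_database=False, alternative_routes=True))  # False
```

With these example values, messages would be resubmitted over alternative routes about three hours after the server became unavailable. In the cases without alternative routes there is no fixed figure, because the delay depends on how long the outage lasts.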

On the subject of duplicate messages, the following link is well worth reading. It has applied since earlier versions of Exchange, and it explains how Exchange can avoid message duplication in some scenarios:

https://msexchangeteam.com/archive/2004/07/14/183132.aspx