Replication storms are a major and relatively common problem associated with public folders. A replication storm occurs when a large amount of data is replicated among public folder servers, typically as a consequence of a change affecting many items or folders. The problem is particularly painful when the changes triggering the replication storm are unintended and the network connections are low bandwidth or high latency.
The stop-resume content replication feature, available once Exchange 2003 SP2 is installed, lets the administrator stop an unintended replication storm and reverse the settings that caused it before content replication is resumed.
How it works
If content replication is not already stopped: when the administrator right-clicks the Organization object, they see the "Stop Public Folder Content Replication" task at the top of the menu (under the Internet Mail Wizard... item).
Clicking "Stop Public Folder Content Replication" displays a message box.
If content replication is already stopped: when the administrator right-clicks the Organization object, they see "Resume Public Folder Content Replication" at the top of the menu (only the task that makes sense to execute is displayed in the menu).
Choosing "Resume Public Folder Content Replication" brings up the following message box:
All public folder content replication for all folders in all public stores on all servers in your organization will resume shortly. It may take some time for this setting to take effect. Continuing may result in a substantial amount of network traffic as each public folder store catches up. Continue?
What does the Stop content replication task do?
- Applies to all servers in the organization.
- Makes servers stop returning requested content data (stops satisfying backfill requests). Servers will still issue backfill requests but they are not going to be answered.
- On the first request to change public folder content, each server logs one informational event within 15 minutes (at the default logging level):
Event Type: Information
Event Source: MSExchangeIS Public Store
Event Category: Replication General
Event ID: 3118
Public folder content broadcasts will not occur because public folder content replication has been disabled in the organization.
When content replication is stopped, servers that do not support this feature (any pre-Exchange 2003 SP2 servers) will behave as follows:
- They will continue broadcasting changes to their contents
- They will continue returning requested content data (will continue to satisfy backfill requests)
What does the Resume content replication task do?
- Applies to all servers in the organization
- Makes servers resume content replication according to their existing schedules
Each server logs one informational event within 15 minutes (at the default logging level):
Event ID: 3119
Public folder content broadcasts will now once again occur because public folder content replication has been reenabled in the organization.
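Because events 3118 and 3119 mark the stop and resume transitions, the most recent of the two on a given server hints at which state that server has picked up. The following is only a minimal sketch in Python: the function name and the (timestamp, event ID) input format are made up for illustration, and it does not read a real event log. Keep in mind that each event may lag the change by up to 15 minutes and that 3118 is only logged once content changes, so a missing event proves nothing.

```python
from typing import Iterable, Optional, Tuple

STOPPED_EVENT_ID = 3118   # "content replication has been disabled"
RESUMED_EVENT_ID = 3119   # "content replication has been reenabled"


def replication_state(events: Iterable[Tuple[float, int]]) -> Optional[str]:
    """Return 'stopped', 'resumed', or None for one server.

    `events` is a hypothetical list of (timestamp, event_id) pairs that
    you have already extracted from that server's Application log.
    """
    latest_time = None
    latest_id = None
    for timestamp, event_id in events:
        if event_id not in (STOPPED_EVENT_ID, RESUMED_EVENT_ID):
            continue  # ignore unrelated events
        if latest_time is None or timestamp >= latest_time:
            latest_time, latest_id = timestamp, event_id
    if latest_id is None:
        return None  # neither event seen; the state is unknown
    return "stopped" if latest_id == STOPPED_EVENT_ID else "resumed"
```

For example, a server that logged 3118 and later 3119 would be reported as "resumed", while a server with neither event comes back as None.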
Note: When content replication is stopped, all servers (those that support the new feature and those that do not) continue sending backfill requests, which will occur based on the standard backfill schedule as follows:
Within a site
First backfill retry
Subsequent backfill retries
The "Stop Public Folder Content Replication" task only blocks the transmission of public folder content (either broadcast or backfill fulfillment). A public folder content replication mail still goes through all the effort of being packed up and is then immediately dropped (simulating immediate loss of the replication e-mail by transport). Change numbers and so on are all updated as if the mail had actually been sent.
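To make the "packed up, then dropped" behavior more concrete, here is a toy model in Python. It is not how the store is implemented: the Org and Store classes, their fields, and the way the gap is computed are all assumptions made up for illustration. It only shows that the change number advances even when the mail is dropped, that backfill goes unanswered while replication is stopped, and that a pre-SP2 server (supports_stop=False) keeps sending.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Org:
    replication_stopped: bool = False  # the organization-wide setting


@dataclass
class Store:
    name: str
    supports_stop: bool = True          # False models a pre-SP2 server
    last_change_number: int = 0
    changes: Dict[int, str] = field(default_factory=dict)     # local changes
    replicated: Dict[int, str] = field(default_factory=dict)  # received from peer

    def broadcast(self, payload: str, peer: "Store", org: Org) -> bool:
        # The change is recorded and numbered as if it will be sent.
        self.last_change_number += 1
        cn = self.last_change_number
        self.changes[cn] = payload
        if org.replication_stopped and self.supports_stop:
            return False                # dropped, as if lost in transport
        peer.replicated[cn] = payload
        return True

    def answer_backfill(self, missing: List[int], org: Org) -> Dict[int, str]:
        if org.replication_stopped and self.supports_stop:
            return {}                   # requests are not answered while stopped
        return {cn: self.changes[cn] for cn in missing if cn in self.changes}


def missing_changes(sender: Store, receiver: Store) -> List[int]:
    # Simplification: the receiver is assumed to know the sender's latest number.
    return [cn for cn in range(1, sender.last_change_number + 1)
            if cn not in receiver.replicated]


org = Org()
a, b = Store("A"), Store("B")
a.broadcast("item 1", b, org)            # delivered normally
org.replication_stopped = True
sent = a.broadcast("item 2", b, org)     # dropped, change number still advances
gap = missing_changes(a, b)              # B is now missing change 2
unanswered = a.answer_backfill(gap, org) # empty while stopped
org.replication_stopped = False
b.replicated.update(a.answer_backfill(gap, org))  # backfill fills the gap
```

The point of the sketch is the last line: after the resume, nothing is re-broadcast, so the receiver only catches up when a backfill request is answered.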
Also note that after public folder content replication is turned back on, replication might take up to 48 hours to "settle" (because of the backfill schedule above). In typical use of this feature, we would stop content replication, make any changes to the replica lists (assuming that a replica list change was the cause of the replication storm that we want to stop), ensure that the replica list change (which is a change in the hierarchy) reaches all servers in the organization, and then turn content replication back on. Depending on the backfill schedule, a 48-hour period might need to pass until replication gets "sorted out."
Hope this was useful. You might not have known that this setting was even there unless you right-clicked on the Organization object!
- Nino Bilic