A customer recently contacted us about how to recover from data corruption on a volume where replicated content was stored. He writes: “We have recently experienced data corruption on our hub server. This data was on a large volume, and I want to avoid a full restore. The data on the volume is collected from a few of our regional servers using DFS Replication. I want to use DFS Replication to ensure that all of the data on the hub server is updated with the ‘good’ data from a regional server. Can I force a sync on a replicated folder, with a designated primary member, without deleting and recreating the replicated folder?”

Our DFS guru Ram Natarajan provided the following response to the customer:

I can think of three ways to do what you intend:

  1. Delete and re-create the replication group. You get to pick a new primary member, so you can pick the member with the “good” data. After you delete the replication group configuration from Active Directory, make sure that all relevant members pick up the change. The PollDsNow WMI method can be used to force Active Directory polls, but because of Active Directory replication latency, there is no guarantee that a given poll cycle will pick up the change. Once a member has picked up the deletion, its event log shows event ID 3006 for the relevant replication group GUID, and you can use this event to verify the change.
  2. Disable and then re-enable the membership in all replicated folders for the member that has the data corruption. (If only one volume was corrupted, you can restrict this to the replicated folders on that volume.) Again, make sure that the member picks up the disable first: wait for event ID 4114 for each of the relevant replicated folders as confirmation, and only then re-enable. Note that in this case there is no need to pick a new primary. Assuming at least one of the partners has already completed initial sync, that partner automatically starts syncing with this member when it is re-enabled, and all content on this member loses all conflicts during its initial sync.
  3. Fence all the (affected) files on the corrupt member to lose on conflicts and let it sync normally. Use wmic or wbemtest to invoke the Fence method in DfsrReplicatedFolderInfo. This method takes three parameters: the path to fence, a Boolean that indicates whether the fence is recursive, and an integer (an enumerated value; you can see the mapping in the MOF file) that indicates the fence value.
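The event-ID confirmation used in options 1 and 2 could be scripted roughly as follows. This is only a sketch, not part of Ram's answer: it assumes the member has the wevtutil tool (Windows Server 2008 or later), that DFSR events are written to a log named "DFS Replication", and that the replication group or replicated folder GUID appears in the rendered event text. The function names are hypothetical.

```python
import subprocess

# Assumption: DFSR writes its events to the "DFS Replication" event log.
DFSR_LOG = "DFS Replication"


def build_event_query(event_id, max_events=5):
    """Return a wevtutil argument list that lists the newest matching events."""
    xpath = f"*[System[(EventID={event_id})]]"
    return [
        "wevtutil", "qe", DFSR_LOG,
        f"/q:{xpath}",       # filter on the event ID (3006 or 4114)
        f"/c:{max_events}",  # cap the number of events returned
        "/rd:true",          # newest events first
        "/f:text",           # human-readable output
    ]


def event_text(event_id):
    """Run the query on the local member and return the raw text (Windows only)."""
    result = subprocess.run(
        build_event_query(event_id), capture_output=True, text=True, check=True
    )
    return result.stdout


def mentions_guid(text, guid):
    """Case-insensitive check that a GUID (with or without braces) is in the text."""
    return guid.strip("{}").lower() in text.lower()
```

On the affected member, `mentions_guid(event_text(4114), folder_guid)` would then act as the "disable has been picked up" check before re-enabling the membership, where `folder_guid` is whatever GUID you read from your own configuration.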

The following wmic command shows an example of how to invoke the Fence method:

wmic /namespace:\\root\microsoftdfs path dfsrreplicatedfolderinfo call Fence "e:\\cs1",TRUE,1

Here, e:\cs1 is the replicated folder root (note that each ‘\’ in the path has to be escaped as ‘\\’), and 1 is the fence value “Initial Sync” (loses to the partner on conflicts).

The valid fence values that you can use with the above call are:

  • Values {"Initial Sync", "Initial Primary", "Default", "Fence"}
  • ValueMap {"1", "2", "3", "4"}
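If you need to fence several replicated folders, a small helper that builds the command line can save you from escaping mistakes. The sketch below only assembles the string. It assumes the Values and ValueMap lists pair up by position (consistent with 1 being “Initial Sync” above), and the helper name `build_fence_command` is hypothetical.

```python
# Name-to-number mapping, assuming Values and ValueMap pair up by position.
FENCE_VALUES = {
    "Initial Sync": 1,     # loses to partner on conflicts
    "Initial Primary": 2,
    "Default": 3,
    "Fence": 4,
}


def build_fence_command(path, fence, recursive=True):
    """Build the wmic command line that invokes Fence on a path."""
    if fence not in FENCE_VALUES:
        raise ValueError(f"unknown fence value: {fence!r}")
    escaped = path.replace("\\", "\\\\")  # wmic needs each '\' doubled
    flag = "TRUE" if recursive else "FALSE"
    return (
        "wmic /namespace:\\\\root\\microsoftdfs path dfsrreplicatedfolderinfo "
        f'call Fence "{escaped}",{flag},{FENCE_VALUES[fence]}'
    )


# Reproduces the example command for the replicated folder root e:\cs1.
print(build_fence_command("e:\\cs1", "Initial Sync"))
```

The printed line is the same command shown earlier; run it on the corrupt member only after you have stopped that member from replicating out.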

For all three methods above, the prerequisite is to prevent DFSR on the affected member from inadvertently replicating files out to any of its partners while the recovery procedure is carried out. One safe and easy way of doing this is to configure DFSR to replicate on a static port and then block that port on the firewall on that machine. This prevents DFSR on that machine from serving out any files while you complete the recovery procedure.
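As a rough illustration of that prerequisite, the sketch below assembles the two commands you might run. It is not from Ram's answer and rests on assumptions: `dfsrdiag StaticRPC` is used to set the static port, `netsh advfirewall` (Windows Server 2008 or later) is used to block inbound TCP traffic on it, and the member name `HUB01`, port `49200`, and rule name are hypothetical placeholders.

```python
def static_port_commands(member, port, rule_name="BlockDfsrStaticPort"):
    """Return the two command lines: pin DFSR to a port, then block that port."""
    if not 1 <= port <= 65535:
        raise ValueError(f"invalid TCP port: {port}")
    return [
        # Configure DFSR on the member to use a fixed RPC port.
        ["dfsrdiag", "StaticRPC", f"/port:{port}", f"/Member:{member}"],
        # Block inbound connections to that port so partners cannot pull files.
        [
            "netsh", "advfirewall", "firewall", "add", "rule",
            f"name={rule_name}", "dir=in", "action=block",
            "protocol=TCP", f"localport={port}",
        ],
    ]


# Hypothetical member name and port, shown for illustration only.
for command in static_port_commands("HUB01", 49200):
    print(" ".join(command))
```

Remember to delete the firewall rule once the recovery is complete, or the member will stay unable to serve files to its partners.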

--Ram