Replacing DFSR Member Hardware or OS (Part 2: Pre-seeding)

Ned here again. Previously I discussed options for performing a hardware or OS replacement within an existing DFSR Replication Group. As part of that process you may end up seeding a new server’s disk with data from an existing server. Pre-seeded files exactly match the copies on an upstream server, so that when initial non-authoritative sync is performed no data will be sent over the network except the SHA-1 hash of each file for confirmation. For a deeper explanation of pre-seeding review:

In order to make this more portable I decided to make this a separate post within the series. Even if you are not planning a file server migration and just want to add some new servers to a replica with pre-seeding, the techniques here will be useful. I demonstrate how to pre-seed from Windows Server 2003 R2 to Windows Server 2008 R2 as this is the common scenario as of this writing. I also call out the techniques needed for other OS arrangements, and I will use both kinds of Windows backup software as well as robocopy in my techniques.

Huge Update!!! We finally have a TechNet article on DFSR Preseeding! It's here! It's called Copying Files to Preseed or Stage Initial Synchronization! Go go go!!! Goooooo!

There are three techniques you can use:

  • Pre-seeding with NTBackup
  • Pre-seeding with Robocopy
  • Pre-seeding with Windows Server Backup

The most important thing is to TEST. Don’t be a cowboy or get sloppy when it comes to pre-seeding; most cases we get with massive conflict problems were caused by lack of attention to detail during a pre-seeding that took a functional environment and broke it.

Read-Only Pre-Seeding

If using Windows Server 2008 R2 and planning on using Read-Only replication, make sure you install the following hotfix before configuring the replicated folder:

An outgoing replication backlog occurs after you convert a read/write replicated folder to a read-only replicated folder in Windows Server 2008 R2 - http://support.microsoft.com/kb/2285835

This prevents a (cosmetic) issue where DFSR displays pre-seeded files as an outbound backlog on a read-only replicated folder. A read-only member cannot have an outbound backlog, naturally.

Pre-seeding with NTBackup

If your data source OS is Windows Server 2003 R2, I recommend you use NTBackup.exe for pre-seeding. NTBackup correctly copies all aspects of a file including data, security, attributes, path, and alternate streams. It has both a GUI and command-line interface.

Prerequisites

If pre-seeding from Windows Server 2003 R2 to Windows Server 2003 R2, no special changes have to be made. If pre-seeding from Windows Server 2003 R2 to Windows Server 2008 or Windows Server 2008 R2, you will need to download an out-of-band version of NTBackup to restore the data:

More info on using NTBackup: http://support.microsoft.com/kb/326216/pl

Critical note: Restoring an entire volume (rather than specific folders, as demonstrated below) with NTBackup will cause all existing replicated folders on that volume to go into non-authoritative sync. For that reason you should never restore an entire volume if you are already using DFSR on a server volume being pre-seeded. Just restore the replicated folders like I do in the examples.

Procedure

1. Start NTBackup.exe on the Windows Server 2003 R2 DFSR computer that has the data you are going to pre-seed elsewhere.

2. Select the Replicated Folder(s) you are going to pre-seed. In the example below I have two RFs on my E: drive:

image

Note: When selecting the replicated folders, you can optionally de-select the DFSRPRIVATE folders underneath them to save time and space in the backup.

3. Backup to a flat file format (locally, if you have the disk capacity).

4. When the backup is complete, copy that file over to your new server that is going to replicate this data in the future. If the server is Win2008 or Win2008 R2, make sure you have the NT Restore tool installed.

Note: very large files – such as NTBackup BKF files that are hundreds of GB – can be copied much faster over a gigabit LAN by using tools that support unbuffered IO. A few Microsoft-provided options for this are:

5. Start the NTBackup tool on your new DFSR server that you are pre-seeding.

image

6. Select to restore data. In the Win2008/R2 restore tools, this is the only option available.

7. Select the backup file, then drill down into the backed up files so that you select the parent folders containing all the user data.

image

Note: You may need to select “Tools”, then “Catalog a backup file” to select a backup to restore.

image

8. Change the “Restore files to:” dropdown to “Alternate Location”

9. Specify the “Alternate Location” path to match what it should be on the new server. In my case the replicated folders had existed on the root of the drive, so I restored them to the root of the new server’s data drive (E:\).

image

Note: By default the security and mount points will be restored. Security must be restored or file hashes will change and the pre-seeding operation will fail. DFSR doesn’t replicate junction points so there is no need to check that box.

image

10. At this point you are done pre-seeding. See section Validating Pre-Seeding. When that is complete you can proceed with replicating the data. You have the option to delete the DFSRPrivate folder that was restored within your RF(s) at this point, as it will not be useful for pre-seeding.

Pre-seeding with Robocopy

If your data source OS is Windows Server 2008, I recommend you use Robocopy for pre-seeding. While Windows Server 2008 supports Windows Server Backup, it lacks granularity in backing up files. Robocopy can also be used on the other operating systems, but there a backup is the preferred method.

Prerequisites

Robocopy is included with Windows Vista and later, but there have been subsequent hotfix versions that are required for correct pre-seeding. It is not included with Windows Server 2003. You must install the following on your computer that will be pre-seeded, based on your environment (there is no reason to install on the server that currently holds the old data files):

  • Download latest Windows Server 2008 R2 Robocopy (KB979808 or later - current latest as of this update is KB2639043)
  • Download latest Windows Server 2008 Robocopy (KB973776 or later)
  • Download Windows Server 2003 robocopy (2003 Resource Kit)

Note: Again, it is not recommended that you pre-seed a new Windows 2003 R2 computer using Robocopy.exe as there are known pre-seeding issues with the version included in the out-of-band Windows Resource Kit Tools. These issues will not be fixed as Win2003 is out of mainstream support. You should instead use NTBackup.exe as described previously.

More info on using robocopy: http://technet.microsoft.com/en-us/library/cc733145(WS.10).aspx

Procedure

1. Log on to the computer that is being pre-seeded with data from a previous DFSR node. Make sure you have full Administrator rights on both computers.

2. Validate that the Replicated Folders that you plan to copy over do not yet exist on the computer being pre-seeded.

Critical note: Do not pre-create the base folders that robocopy is copying and then copy into them; let robocopy create the entire source tree. Under no circumstances should you change the security on the destination folders and files after using robocopy to pre-seed the data, as robocopy will not synchronize security if the file’s data stream matches, even when using /MIR.

Consider robocopy a one-time option. If you run into some issue with it, delete all the data on the destination and re-run the robocopy commands. Do not try to “fix” the existing data as you are very likely to make things worse.

image

3. Sync the folders using robocopy with the following argument format:

Robocopy.exe "\\source server\drive$\folder path" "destination drive\folder path" /b /e /copyall /r:6 /xd dfsrprivate /log:robo.log /tee

For example:

image

Note: You have the option to use the multi-threaded /MT option, available starting in the Windows 7 and Win2008 R2 version of Robocopy, to copy more than one file at a time. The downside of /MT is that you cannot easily see copy progress.

Note: You also have the option to use the /LOG option to redirect all output to a file for later review. This is useful to see more specifics about errors if encountered. The downside is that you will see no console progress unless you also specify /TEE, as in the command above.

image

Note: These arguments use a backup API that can copy most in-use file types (/b), include subfolders and their files, even empty folders (/e), copy all aspects of a file (/copyall), retry 6 times if a file copy fails (/r:6), exclude folders called DfsrPrivate (/xd dfsrprivate), write to a log (/log:robo.log), and also output to the console (/tee). The DfsrPrivate exclusion can be changed to a full path if you suspect that a legitimate user data folder with this name exists deeper in the Replicated Folder (typically it does not; if any copies exist they are usually from previously replicated folders that should have been cleaned up by a file server administrator).

4. When the copy completes, validate that there were no errors and that only one folder was skipped (that will be the DFSRPrivate folder).

Note: if you find FAILED entries, you can review the log for specifics.

5. At this point you are done pre-seeding. See section Validating Pre-Seeding. When that is complete you can proceed with replicating the data.
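If you prefer to script the copy and the step 4 check, here is a minimal Python sketch of the idea. It only builds the argument list shown in step 3 and reads the Dirs and Files rows of the summary table that robocopy writes at the end of its log. The function names are mine, and the column layout of the summary (Total, Copied, Skipped, Mismatch, FAILED, Extras) is an assumption based on typical English-language robocopy output, so test it against your own robo.log before trusting it.

```python
import re

def build_robocopy_args(source, destination, log_file="robo.log"):
    """Return the pre-seeding argument list from step 3 as a list of strings."""
    return [
        "robocopy.exe", source, destination,
        "/b", "/e", "/copyall", "/r:6",
        "/xd", "dfsrprivate",
        "/log:" + log_file, "/tee",
    ]

# Assumed layout of the summary rows, for example:
#     Dirs :        12        11         1         0         0         0
SUMMARY_ROW = re.compile(
    r"^\s*(Dirs|Files)\s*:\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*$"
)
COLUMNS = ("total", "copied", "skipped", "mismatch", "failed", "extras")

def parse_summary(log_text):
    """Extract the Dirs and Files counters from robocopy log text."""
    summary = {}
    for line in log_text.splitlines():
        match = SUMMARY_ROW.match(line)
        if match:
            counts = [int(value) for value in match.groups()[1:]]
            summary[match.group(1).lower()] = dict(zip(COLUMNS, counts))
    return summary

def preseed_looks_clean(summary):
    """True when nothing failed and exactly one folder (DfsrPrivate) was skipped."""
    dirs = summary.get("dirs")
    files = summary.get("files")
    if dirs is None or files is None:
        return False  # summary table not found; review the log by hand
    return dirs["failed"] == 0 and files["failed"] == 0 and dirs["skipped"] == 1
```

On the server being pre-seeded you could pass the list from build_robocopy_args to subprocess.run, then feed the contents of robo.log to parse_summary. A False result from preseed_looks_clean just means "go read the log", not that the script knows what went wrong.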

Pre-seeding with Windows Server Backup

If your data source OS is Windows Server 2008 R2, I recommend you use Windows Server Backup (WSB) for pre-seeding. WSB correctly copies all aspects of a file including data, security, attributes, path, and alternate streams. It has both a GUI and command-line interface. I do not recommend WSB on Windows Server 2008 non-R2, as it lacks granularity in backing up files – refer to the Robocopy section of this article if your source computers are Win2008 non-R2.

Prerequisites

Windows Server Backup must be installed as a feature on the DFSR computers; it is not available by default. This can be done through ServerManager.msc or DISM.EXE.

More info on using Windows Server Backup: http://technet.microsoft.com/en-us/library/ee849849(WS.10).aspx

Procedure

1. Start Wbadmin.msc on the Windows Server 2008 R2 DFSR computer that has the data you are going to pre-seed.

2. Select “Backup Once” and then under “Select Backup Configuration” choose “Custom”.

image

3. Use “Add Items” to select the replicated folders that you will be pre-seeding.

image

Note: Do not attempt to exclude the DFSRPrivate junction point folders, as you will receive an error “one of the file paths specified for backup is under a reparse point”.

4. Select where to store the backup. This can be local if you have another disk with enough capacity, or a remote network location. It cannot be the same drive as the replicated folders being backed up.

image

5. If the backup was done locally, copy the WindowsImageBackup folder containing your backup to the location where you will restore the data. It could be a disk on the server you are pre-seeding or a central file share. It cannot be the actual disk(s) you are going to restore data to on the new computer.

6. Start Windows Server Backup on your server that you are pre-seeding with data and select “Recover”.

7. Select “A backup stored on another location”.

8. Select the correct location type. If the file was saved to this server, select “Local drives” and if it’s on another file share choose “Remote shared folder”.

9. You will see the old source data server in the list. Select the server and proceed.

image

10. The backup dates will be listed. By default the most recent will be displayed and this should be your backup; if not choose the correct one.

image

11. Select “Files and Folders” for the “Recovery Type”.

12. For “Items to Recover”, select the server node in “Available Items” tree. Whatever folder you select here, all of its child objects will be restored. For example, here I had two replicated folders on this server at the root of the drive that I backed up. If I just restore the “E” drive backup contents, both folders will be restored.

image

13. Under “Specify Recovery Options” select the destination path. Set “Overwrite the existing versions with the recovered versions”. Make sure that “restore access control list…” is enabled (i.e. checked ON).

image

Note: Typically there should be no existing data to overwrite in this scenario; this radio button is selected for completeness. Pre-seeded data should win, which is why you are using it; existing data cannot be trusted.

14. Restore the data by selecting “Recover”.

15. At this point you are done pre-seeding. See section Validating Pre-Seeding. When that is complete you can proceed with replicating the data. You have the option to delete the DFSRPrivate folder that was restored within your RF(s) at this point, as it will not be useful for pre-seeding.

Validating Pre-seeding

Having theoretically pre-seeded correctly at this point, you need to spot check your work and validate that the file hashes match between the source server and the pre-seeded server. If a half dozen match up, you are usually safe to assume all the rest worked out; validating every single file is possible, but in a large data set it will be very time consuming and of little value.

Prerequisites

You must have a Windows 7 or Windows Server 2008 R2 computer somewhere in your environment (even if it is not part of the DFSR environment being migrated) as it includes a new version of DFSRDIAG.EXE that has a filehash checking tool. If you do not have at least a Windows 7 computer running RSAT you will not be able to properly validate SHA-1 DFSR file hash data.

  • If using Win7, install RSAT and add the Distributed File System tools.

  • If using Win2008 R2 servers, add the Feature of Distributed File System tools.

image

Note: If you have no copy of Windows 7 you must open a support case in order to gain access to an unsupported internal tool for file hash checking. The cost of this support case is at least the same as a copy of Windows 7 though and the tool you are provided will receive no support, so this is not as advisable as purchasing one Win7 license.

More info on using DFSRDIAG FILEHASH: http://blogs.technet.com/b/filecab/archive/2009/01/19/dfs-replication-what-s-new-in-windows-server-2008-r2.aspx

Procedure

1. Note the path of six files within the source data server. These should be scattered throughout various nested folder trees.

2. For one of those test files, use DFSRDIAG.EXE to get a hash from the source computer and the matching file on the pre-seeded computer:

DFSRDIAG.exe filehash /path:"source computer path file"

DFSRDIAG.exe filehash /path:"pre-seeded computer path file"

For example:

image

3. If DFSRDIAG shows the same hash value for both copies of the file, it has been pre-seeded correctly and matches in all file aspects (data stream, alternate data stream, security, and attributes). If it doesn’t match, you made a mistake in your pre-seeding or someone has changed the files after the fact. Start over.

4. Repeat for five more files (or more until you feel comfortable that pre-seeding was done perfectly).

Note: If you want to check every file, consider using DIR /B to build a list of all files on both servers, then using a FOR loop to export the hashes from all of them. But expect to wait a long time. 
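The spot check above can also be sketched in Python. This is an illustration only, with assumptions called out: it assumes DFSRDIAG prints a line of the form "File Hash: XXXXXXXX-XXXXXXXX-XXXXXXXX-XXXXXXXX", it must run on a Win7 or Win2008 R2 computer that has the DFS tools installed, and every function name here is made up for the example.

```python
import random
import re
import subprocess

HASH_LINE = re.compile(r"File Hash:\s*([0-9A-Fa-f-]+)")

def parse_filehash(output):
    """Pull the hash out of DFSRDIAG FILEHASH output, or None if absent."""
    match = HASH_LINE.search(output)
    return match.group(1).upper() if match else None

def dfsr_filehash(path):
    """Run DFSRDIAG FILEHASH against one file (needs the Win7/R2 DFS tools)."""
    result = subprocess.run(
        ["dfsrdiag.exe", "filehash", "/path:" + path],
        capture_output=True, text=True,
    )
    return parse_filehash(result.stdout)

def pick_samples(relative_paths, count=6, seed=None):
    """Choose up to `count` random files, ignoring anything under DfsrPrivate."""
    candidates = [
        p for p in relative_paths
        if "dfsrprivate" not in p.lower().split("\\")
    ]
    return random.Random(seed).sample(candidates, min(count, len(candidates)))

def find_mismatches(source_root, dest_root, relative_paths, hasher=dfsr_filehash):
    """Return the relative paths whose hashes differ or could not be read."""
    mismatches = []
    for rel in relative_paths:
        source_hash = hasher(source_root + "\\" + rel)
        dest_hash = hasher(dest_root + "\\" + rel)
        if source_hash is None or source_hash != dest_hash:
            mismatches.append(rel)
    return mismatches
```

Feed pick_samples a list of paths relative to the replicated folder root (for example, built from DIR /B /S output with the root trimmed off), then hand the result to find_mismatches with the UNC path of the source folder and the local path of the pre-seeded folder. An empty list back is what you want; anything else means start over, as described in step 3.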

Update 03/04/2011: Paul Fragale has written a DFSRDIAG FILEHASH PowerShell script that does automated spot checking for you. Grab it here: http://gallery.technet.microsoft.com/scriptcenter/1de44cc1-ce79-4e98-9283-92548fc02af9

Final Considerations

Keep in mind that unless your data is 100% static or users are not allowed to modify files during pre-seeding and DFSR initial sync, some file conflicts are to be expected. These will be visible in the form of DFSR Event Log 4412 entries on the server that was pre-seeded. The point of pre-seeding is to minimize the amount of data to be replicated initially during the non-authoritative replication phase on the downstream server; unless data never changes there will always be a delta that DFSR will have to catch up after pre-seeding.

Series Index

- Ned “beanstack” Pyle

  • Hello,

    Your article is very instructive. However I have a hard time pre-seeding a new server with robocopy.

    I can't manage to get the same hash on the source server (Win 2003 R2) and the destination server (Win 2008 R2). I know the files aren't changed during the process (testing on a small archive folder). I try to follow your procedure but every time the hash doesn't match:

    - running from Win 2008 R2

    - robocopy \\data01\n$\packteam e:\packteam /b /e /copyall /r:3 /xd dfsrprivate

    - dfsrdiag filehash /path:"\\data01\n$\PackTeam\_Archive\planche01.jpg"

      File Hash: B2AB0CBC-AE0E8461-0DB614E7-6448E540

      Operation Succeeded

    - dfsrdiag filehash /path:"e:\packteam\_Archive\planche01.jpg"

      File Hash: FB8872ED-4009A2F7-C0F35394-B207AB0A

      Operation Succeeded

    Any idea about this? Thank you.

  • Did you update the version of robocopy.exe with KB: support.microsoft.com/default.aspx ?

    If you don't update it, you will have the problem you described.

  • Hum...I'm very sorry because I missed that one. And you did write it in your article.

    Now it's working perfectly! Thank you so much for your time.

  • I've run into the same problem. I am running the robocopy command from my new target server, which is Storage Server Standard 2008 SP2. I tried to apply the hotfix mentioned above but it stated the "update does not apply". The other members of the replication group I'm attempting to join are 2003 std R2. Another odd thing I've run into: although I can run DFSRDIAG with the filehash switch on my Windows 7 workstation, that parameter is not recognized on the 2008 storage server or the 2003 servers.

  • Tell me what version of Robocopy you see currently on your Storage Server Edition. It's quite possible that the robocopy update will not install on that machine because storage server is a special SKU that is maintained by the OEM you bought it from, NOT by Microsoft. So if it will not install, you need to contact the vendor of that server and get them to fix it for you. If they cannot or will not assist you, you can install that robocopy update on a non-storage server edition or on a Win7 client, and run the commands remotely.

    The other issue is expected - DFSRDIAG FILEHASH only started existing in Win2008 R2 and Win7. That's why the article says this only works on Win7 and R2. :)

  • Hello

    A very good article. I have a question I'd like to ask.

    For space reasons, I want to move DFS shared folders to another larger hard drive on the same Windows 2003 R2.

    If I follow all the steps: Stop the services, make NTbackup to folders , restore folders on the new drive, check the hash of the files ...

    Could I remove the old drive and assign the drive letter that had the old unit to the new drive? Does DFS (and reply) work without having to touch the DFS config?

    Thank you very much

  • You could re-use the same drive letter if you wanted for personal convenience, but DFSR is going to have to reinitialize all that data no matter what. It keeps track of drive signature and has a hidden database on that volume; so, it cannot be tricked. :)

    It is going to have to do a new sync here no matter what - just a question of whether you minimize time with pre-seeding correctly.

  • It sounds like we should add support.microsoft.com/.../2285835 to the recommended hotfixes KB :)

  • Your article is excellent, thank you.  In our environment, a few files will change after backing up.  I am confused.  Can I still preseed?  How will DFS-r handle the changes?

    Thank you.

  • Hi,

    The delta of changes will replicate to the new server. The backup is mainly to cover the 99% of files that won't be changing and save you that replication.

  • FYI, the updated 2008R2 Robocopy has been included in 2008R2 SP1.

    www.microsoft.com/.../details.aspx

  • Indeed it has! But you already knew that as a frequent AskDS subscriber...

    blogs.technet.com/.../sp1-and-directory-services-what-s-new.aspx

    ;-P

  • Great article Ned.  What about a process where we need to add a large amount of data to an existing DFS Replication Group?  Do we remove and re-create the replication group and pre-seed it?

  • Define a "large amount". :)

    That sounds like overkill, especially since it's new data that no one has ever accessed before, so they won't mind it not being on both servers for awhile...

  • My situation is a little more unique. We have a remote site which replicates to our corporate office via old RSYNC and these are both Novell servers. Yes, we are still on Novell… We recently moved the remote site to a Windows 2003 R2 server and, like other Windows Server based remote sites, we will use DFSr to backup the data nightly. So my Pre-seeded data is from a Novell server. I want to just “copy” this data to the Corporate (Hub) server and then create a Replication group using this pre-seeded data. There is only two servers in the replication group, the Branch office and our Corporate office and the data flows from Branch  to Corporate for backups only. So can someone help me understand any additional steps I need to perform to get this replicating properly. The Pre-seeded data I have is about 160GB and is only a couple days old from the branch office, so I assume, if I can get DFSr to see this data correctly, it will not have to replicate all the data back over a slow WAN link. I’m also fearing that the actual “updated”(newer) file at the branch office will somehow disappear or move to some system folder, like ‘conflict and resolution’ folder if not done right.