Microsoft's official enterprise support blog for AD DS and more
[Note from Ned: this article was created and vetted by the Microsoft development teams for DFS Replication, DFS Namespaces, Offline Files, Folder Redirection, Roaming User Profiles, and Home Folders. Due to some TechNet publishing timelines, it was decided to post here in the interim. This article will become part of the regular TechNet documentation tree at a later date. The primary author of this document is Mahesh Unnikrishnan, a Senior Program Manager who works on the DFSR, DFSN, and NFS product development teams. You can find other articles by Mahesh at the MS Storage Team blog: http://blogs.technet.com/b/filecab.
The purpose of this article is to clarify exactly which scenarios are supported for user data profiles when used with DFSR, DFSN, FR, CSC, RUP, and HF. It also explains why the unsupported scenarios should not be used. When you finish reading this article, I recommend reviewing http://blogs.technet.com/b/askds/archive/2009/02/20/understanding-the-lack-of-distributed-file-locking-in-dfsr.aspx
Update 4/15/2011 - the DFSR development team created a matching KB for this - http://support.microsoft.com/kb/2533009]
Consider the following illustrative scenario. Contoso Corporation has two offices – a main office in New York and a branch office in London. The London office is a smaller office and does not have dedicated IT staff on site. Therefore, data generated at the London office is replicated over the WAN link to the New York office for backup.
Contoso has deployed a file server in the London branch office. User profiles and redirected home folders are stored on shares exported by that file server. The contents of these shares are replicated to the central hub server in the New York office for centralized backup and data management. In this scenario, a DFS namespace is not configured. Therefore, users will not be automatically redirected to the central file server if the London file server is unavailable.
As illustrated by the diagram above, there is a file server hosting home folders and user profile data for all employees in Contoso’s London branch office. The home folder and user profile data is replicated using DFS Replication from the London file server to the central file server in the New York office. This data is backed up using backup software like Microsoft’s System Center Data Protection Manager (DPM) at the New York office.
Note that in this scenario, all user initiated modifications occur on the London file server. This holds true for both user profile data and the data stored in users’ home folders. The replica in the New York office is only for backup purposes and is not being actively modified or accessed by users.
There are a few variants of this deployment scenario, depending on whether a DFS Namespace is configured. The following sub-sections detail these deployment variants and specify which of them are supported.
This is a variation of the above scenario, with the only difference being that DFS Namespaces is set up to create a unified namespace across all shares exported by the branch office file server. However, in this scenario, all namespace links must have only one target: the share hosted by the branch office file server. (Deployment scenarios where namespace links have multiple targets are discussed later in this document.)
Both variants of this deployment scenario are supported. The key point to remember for this deployment scenario is that only one copy of the data is actively modified and used by client computers, thereby avoiding issues caused by replication latencies and users accessing potentially stale data from the file server in the main office (which may not be in sync).
The following use-cases will work in this deployment scenario:
In this scenario, the following technologies are supported and will work:
Designing for high availability
DFS Replication in Windows Server 2008 R2 includes the ability to add a failover cluster as a member of a replication group. To do so, refer to the TechNet article ‘Add a Failover Cluster to a Replication Group’. Offline files and Roaming User Profiles can also be configured against a share hosted on a Windows failover cluster.
For the above-mentioned deployment scenarios, the branch office file server may be deployed on a failover cluster to increase availability. This makes the branch office file server resilient to hardware- and software-related outages affecting individual cluster nodes, so that it can provide highly available file services to users in the branch office.
Consider the same scenario described above with a few differences. Contoso Corporation has two offices – a main office in New York and a branch office in London. Contoso has deployed a file server in the London branch office. User profiles and redirected home folders are stored on shares exported by that file server. The contents of these shares are replicated to the central hub server in the New York office for centralized backup and data management.
In this scenario, a DFS namespace is configured in order to enable users to be directed to the replica closest to their current location. Therefore, namespace links have multiple targets – the file server in the branch as well as the central file server. Optionally, the namespace may be configured to prefer issuing referrals to shares hosted by the branch office file server by ordering referrals based on target priority.
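To make the effect of target priority concrete, here is a minimal sketch of how a referral list might be ordered. It is a simplified model, not the actual DFS Namespaces algorithm: the `Target` class, the priority constants, the site cost values, and the server names are all assumptions made up for illustration.

```python
from dataclasses import dataclass

# Hypothetical priority classes, loosely modeled on the idea of target priority.
GLOBAL_HIGH, SITE_COST_HIGH, NORMAL, SITE_COST_LOW, GLOBAL_LOW = range(5)


@dataclass(frozen=True)
class Target:
    path: str              # UNC path of the link target (made-up names below)
    site_cost: int         # cost from the client's site; lower means closer
    priority: int = NORMAL
    enabled: bool = True   # disabled targets are never handed out


def order_referrals(targets):
    """Return target paths in the order a client would try them (simplified)."""
    def key(t):
        if t.priority == GLOBAL_HIGH:
            bucket = 0     # always ahead of everything else
        elif t.priority == GLOBAL_LOW:
            bucket = 2     # always behind everything else
        else:
            bucket = 1     # ordered by site cost, then by priority within a cost
        return (bucket, t.site_cost, t.priority)

    return [t.path for t in sorted((t for t in targets if t.enabled), key=key)]


# A London client: the branch server is local, the hub is across the WAN
# and is additionally pushed to the end of the list.
branch = Target(r"\\LON-FS1\Profiles", site_cost=0)
hub = Target(r"\\NYC-FS1\Profiles", site_cost=100, priority=GLOBAL_LOW)
print(order_referrals([hub, branch]))
```

Note that even with the hub ordered last, it still appears in the referral list, so a client can fall through to it whenever the first target looks unreachable. That is the behavior the rest of this scenario is concerned with.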
The replica in the central hub/main site may optionally be configured to be a read-only DFS replicated folder.
What can go wrong?
As a result of the behavior described above the following consequences may be observed:
This deployment variant helps avoid the problems caused by DFS Namespaces failing over due to transient network glitches or when it encounters specific SMB error codes while accessing data. This is because the referral to the share hosted on the central file server is normally disabled.
However, the important thing to note is that the side-effects of replication latencies are still unavoidable. Therefore, if the data on the central file server is stale (i.e. replication has not yet completed), it is possible to encounter the same problems described in the ‘What can go wrong?’ section above. Before enabling the referral to the central file server, the administrator may need to verify the status of replication to ensure that the adverse effects of data loss or roaming profile corruption are contained.
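One crude way to picture "verifying the status of replication" is to compare the two copies of a replicated folder directly. The sketch below is an independent check written for illustration only; it is not how DFS Replication itself reports health, and the function names are assumptions. It skips the `DfsrPrivate` folder that DFSR keeps under each replicated folder, since that content is local to each member.

```python
import hashlib
import os


def snapshot(root, ignore=("DfsrPrivate",)):
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    result = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune ignored directories in place so os.walk does not descend into them.
        dirnames[:] = [d for d in dirnames if d not in ignore]
        for name in filenames:
            full = os.path.join(dirpath, name)
            digest = hashlib.sha256()
            with open(full, "rb") as fh:
                for chunk in iter(lambda: fh.read(1 << 20), b""):
                    digest.update(chunk)
            result[os.path.relpath(full, root)] = digest.hexdigest()
    return result


def out_of_sync(branch_root, hub_root):
    """Return the relative paths that are missing or different on either side."""
    branch, hub = snapshot(branch_root), snapshot(hub_root)
    return sorted(p for p in branch.keys() | hub.keys() if branch.get(p) != hub.get(p))
```

An empty result from `out_of_sync` only says the two trees matched at the moment they were read. Files that users have open, or changes still being written on the branch server, can make the hub stale again immediately afterwards.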
Neither variant of this deployment scenario (2A or 2B) is supported. The following deployment use-cases will not work:
In this scenario, the following technologies are not supported and will not work as expected:
Great article! Thanks!
Agreed, great article! Although this focuses on "User Profile Data" the same should be said for any multi-user edited documents that don't have a file lock in play (hence Ned's link). Especially those that only write their changes on file exit, like Microsoft Access MDB files for example.
Out of curiosity, what is the Microsoft view or suggestion to create server failover for a DR scenario in these cases? For other DFS configurations, we do have a global high/low priority setup for our links. Is there any suggested way to accomplish a DR configuration for DFS-N, without manually needing to change links or use a 3rd party?
I typically find that the planning around DR site 'failover' times is unreasonably aggressive, i.e. customers want to have a DR site come online instantly if their main office servers were all destroyed in one fell swoop.
Except that generally speaking a disaster of that magnitude:
1. Is likely to have destroyed most users' entry point into the DR site as well.
2. Is likely to have destroyed most of the users' own access points (main corporate network).
The above means that instant failover is unlikely and that the business will not mind a little downtime if the entire datacenter suffered a meteor strike. Which by the way killed all your IT staff, as they are always colocated, so your DR site isn't going to be managed.
So when you gear down to a more common "not disaster" - such as a branch site whose main file server has gone offline due to hardware failure - a few seconds or minutes for an admin to enable a disabled DFS link target or to switch an existing one to another target is usually pretty acceptable.
If data is mostly static and users are never going to have a reason to access a DFSN link target, you might accept the supportability risk and simply have a secondary link online that is extremely high site costed and unlikely to be hit. But we have a few cases every day of customers accessing the wrong DFSN link targets due to many factors outside their control, such as network conditions and 3rd party WAN appliance issues. You get back into all the risks described in the article at that point.
Thanks for the feedback, and that is pretty much what I assumed was the case. Just wanted to make sure there wasn't some automated "Hey, this File Server is no longer reachable, quick switch any links that point to it" magical hidden feature. :)
In all seriousness, some native Microsoft way to script/automate that would be much appreciated, even if it does still take a few seconds to minutes. There are times when admins with proper knowledge and rights aren't available, after all. I'm thinking along the lines of PowerShell cmdlets, but I imagine you guys are already (hopefully) working on such things.
As always, appreciate all the great information you guys put together here!
On second thought... I imagine that "native Microsoft" way will be directed towards SCOM?
That's a very interesting notion. It would be quite scriptable with DFSUTIL and DFSCMD. And you could use SCOM 2007 as well as a trigger. Or some other product, or trigger mechanism.
Very interesting, I shall dwell on this as a potential blog post...
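As a rough sketch of the trigger half of that idea: probe the branch file server's SMB port and only act after several consecutive failures, so a single transient glitch does not cause a switch. This is an assumption-laden illustration, not a Microsoft-provided mechanism. The function names are hypothetical, and `enable_standby` is a placeholder callback standing in for whatever tooling actually enables the disabled link target.

```python
import socket


def smb_reachable(host, port=445, timeout=3.0):
    """Return True if a TCP connection to the SMB port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def should_fail_over(probe, host, attempts=3):
    """Return True only if every one of `attempts` consecutive probes fails."""
    return not any(probe(host) for _ in range(attempts))


def check_and_fail_over(primary_host, enable_standby, probe=smb_reachable, attempts=3):
    """Call the (hypothetical) enable_standby callback if the primary looks down."""
    if should_fail_over(probe, primary_host, attempts):
        enable_standby()
        return True
    return False
```

A real script would also want a delay between probes, and, per the article, a check that replication to the standby target has caught up before the target is enabled.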
Bummer. Scenario 2A is exactly what I had wanted to do. I'm glad I read this article and I guess I will have to come up with something different...
This is the very sort of doc I have wanted for many years... But I need a solution for a variant of 2A.
I plan to have TWO local fileservers, plus one in the central site for tape backup, all configured as DFS targets (but the central site one is disabled). We can assume a reasonably high-speed connection between the two local fileservers (on the same subnet, same physical switch, 1Gbps connection). I understand we still cannot guarantee 100% sync between those two fileservers, but I guess it should work... Can you comment on this, please?
Also, what happens if I disable the DFS target of one of the local fileservers (leaving one local fileserver Enabled, while the other local one and the central office one are disabled) and manually switch Enable/Disable between the two local servers in case of server failure? Is this supported?
What about if you made the two local servers into one cluster, Ikono? Then you get all the advantages and none of the disadvantages (plus you could save money on disk, as you would need half the space of two full DFSR servers).
For your second question, yes that would work and be supportable.
Thank you, NedPyle. Mainly due to the cost associated with shared disk (iSCSI, etc.) I preferred not to go for clustering. But I am glad to learn the second one is supportable, as I can start testing right now and implement very soon :)
What are your thoughts about this scenario (Variant Scenario 2A)?
• Central Site Server is hosting ONLY the redirected profile subfolders (not the entire profile):
○ Documents, Desktop, Favorites, Downloads
• The Redirected folders are located in the users H: drive (Home Path) mapped to - \\Domain\Namespace\Profile
• The data is replicated using DFS Replication over the WAN between a single branch office and Central Site
• Namespace link has multiple targets (Active) - Central Site and Branch Office (Sites and Services is configured)
• Third party solution 'AppSense' is controlling Profile/Settings Roaming across the environment
I understand that in a failover the user may not have the latest copy of their data, but "critical" data is stored in separate department shares which are replicated, (Namespace link has multiple targets) but is only active (enabled) to one.
Since you're using a third party profile tool here, I have no way to comment on this anymore. I can't tell what it might do or want or need. But from what I read here, no, this is not supported and contradicts our recommendations around multiple access points to the same data by the same user.
The scenario indicates you have multiple links, but only one link is active. That part fits into the prescribed scenarios. However, you indicated you’re using AppSense. I can’t comment on what AppSense does and does not support. Microsoft does not support AppSense.
Thanks for the useful info! Hopefully you can help me with a question...
It makes sense to me that there could be corruption problems with a profile because there are several files that may be modified concurrently and an incomplete replication would leave them in an inconsistent state. However, if CSC is not used, I don't understand why a redirected folder (such as a user's home folder) with multiple targets is at any greater risk than any other replicated folder with multiple targets.
Are you saying that replicated folders with multiple targets are also not supported? I assume that is not true because it would render DFSR useless. Can you clarify as to why folder redirection against a namespace with multiple link targets is different from any other scenario using multiple targets?
The risks with folder redirection against a namespace with multiple targets backing the share are due to replication latencies and due to the switch between targets in case of transient network glitches. It may be disconcerting to a user to find that a document he recently worked on/saved is unavailable when he was redirected to another replica. If you've configured replication schedules (say off hours replication), this may be an expected outcome.
Another problem that can arise is that of replication conflicts, since you now have a scenario where the user may be modifying the file on two different replicas. Since DFSR has last-writer-wins replication conflict resolution, you'd find that a previously modified version of the file is moved to the ConflictAndDeleted directory. It just becomes a lot more difficult for an administrator to support, since you have to be ready to restore files from the ConflictAndDeleted folder in case users complain of files going missing.
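The last-writer-wins outcome described above can be sketched with a toy model. This is only an illustration of the rule for two edits to the same file; the `Version` class, the `replicate` function and the timestamps are assumptions, and real DFSR conflict handling involves more than a single timestamp comparison.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    replica: str     # where this edit was made
    modified: float  # last-write time (arbitrary units)
    content: str


def replicate(local, incoming, conflict_and_deleted):
    """Keep the most recently written version; park the loser for later restore."""
    if incoming.modified > local.modified:
        winner, loser = incoming, local
    else:
        winner, loser = local, incoming
    if loser.content != winner.content:
        conflict_and_deleted.append(loser)
    return winner


# The user saves on the London server, is then redirected to New York before
# replication completes, and edits the stale copy there a minute later.
london = Version("London", modified=100.0, content="report v2")
new_york = Version("New York", modified=160.0, content="report v1 + new paragraph")

conflicts = []
surviving = replicate(london, new_york, conflicts)
print(surviving.replica, [v.replica for v in conflicts])
```

In this run the later New York edit survives and the London save, which the user probably considers the real one, ends up in the conflict list. That is the "file went missing" complaint the administrator then has to resolve from ConflictAndDeleted.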