In this blog post I will assume there exists a source cluster that consists of a two node Exchange 2007 SP1 Single Copy Cluster (SCC) hosted on either Windows 2003 or Windows 2008.
Standby Continuous Replication (SCR) was designed, in this type of deployment, to have a target that is a single node cluster. Recently I've received several requests on how this could be extended to a two node cluster functioning as the SCR target.
Single copy clusters make a two node target more complicated because of the shared storage. SCR requires that the same drive letters / paths used for databases, logs, and system files on the source also exist on the target. We must also take into consideration that the storage necessary for replication can only be owned by a single node at a time (shared nothing cluster model), and therefore only one node of the target cluster can be subscribed as the SCR target.
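As a rough illustration of the path requirement above, here is a minimal Python sketch that compares the paths used on the source with those present on the target. The function name and the sample paths are hypothetical, and the paths are compared as plain strings using Windows rules (case-insensitive); nothing here queries Exchange or the cluster.

```python
from pathlib import PureWindowsPath


def missing_on_target(source_paths, target_paths):
    """Return the source paths that have no matching path on the target.

    Paths are compared with Windows semantics, so drive letter and
    folder name case do not matter. This is a hypothetical helper that
    works on lists of strings only.
    """
    target = {PureWindowsPath(p) for p in target_paths}
    return [p for p in source_paths if PureWindowsPath(p) not in target]


if __name__ == "__main__":
    # Hypothetical database and log paths, for illustration only.
    source = [r"E:\SG1\Database", r"F:\SG1\Logs"]
    target = [r"e:\sg1\database"]
    for path in missing_on_target(source, target):
        print("Missing on SCR target:", path)
```

Any path reported by a check like this would need to be created on the node subscribed as the SCR target before replication could use it.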
If you desire to have a two node SCR target, consider making the following configuration changes to assist in ensuring that the physical disk resources are owned on the correct node.
Windows clustering allows administrators to specify a list of preferred owners on the properties of a clustered group. On an Exchange cluster the preferred owners list is generally cosmetic, but when it is combined with a failback policy the settings become more than cosmetic. The preferred owners list establishes, in order, the nodes the administrator prefers the group be hosted on when those nodes are available. The failback policy then tells the cluster where and when to move the group automatically as specific nodes become available. Both are evaluated any time cluster membership changes, for example, when a node that is a member of the cluster is rebooted. Let's look at a few examples of this as it applies to our SCR target.
Example #1:
I have a two node SCR target with a group configured to hold my physical disk resources. I have set a preferred owners list of NodeA then NodeB and a failback policy of immediate. The group is currently owned on NodeA. At patch management time I apply the necessary hotfixes to NodeB and reboot the server. When NodeB has successfully rejoined the cluster, I then apply the patches to NodeA and reboot. The disk group automatically moves from NodeA to NodeB. When NodeA successfully rejoins the cluster, the disk group automatically moves back to NodeA. Replication can now successfully resume since the underlying storage necessary for replication is present on NodeA, and NodeA is subscribed as the SCR target. In this instance cluster membership changed during the reboot causing the cluster to evaluate the preferred owners list and failback policy and take actions as defined.
Example #2:
I have a two node SCR target with a group configured to hold my physical disk resources. I have set a preferred owners list of NodeA then NodeB and a failback policy of immediate. The group is currently owned on NodeA. NodeA experiences a blue screen condition due to a faulty storage driver. The disk group automatically moves from NodeA to NodeB. When NodeA automatically reboots and successfully rejoins the cluster, the disk group automatically moves back to NodeA. Replication can now successfully resume since the underlying storage necessary for replication is present on NodeA, and NodeA is subscribed as the SCR target. In this instance cluster membership changed during the reboot causing the cluster to evaluate the preferred owners list and failback policy and take actions as defined.
Example #3:
I have a two node SCR target with a group configured to hold my physical disk resources. I have set a preferred owners list of NodeA then NodeB and a failback policy of immediate. The group is currently owned on NodeA. At patch management time I apply the necessary hotfixes to NodeB and reboot the server. When NodeB has successfully rejoined the cluster, I launch Failover Cluster Management and manually move the disk group from NodeA to NodeB. I then apply the patches to NodeA and reboot the server. When NodeA successfully rejoins the cluster, the disk group automatically moves back to NodeA. Replication can now successfully resume since the underlying storage necessary for replication is present on NodeA, and NodeA is subscribed as the SCR target. In this instance cluster membership changed during the reboot, causing the cluster to evaluate the preferred owners list and failback policy and take actions as defined.
Example #4:
I have a two node SCR target with a group configured to hold my physical disk resources. I have set a preferred owners list of NodeA then NodeB and a failback policy of immediate. An administrator, using Failover Cluster Management, moves the disk group from NodeA to NodeB. The group is not moved back. Replication will enter a failed state for all instances since the storage necessary for replication to function is no longer present on the node subscribed to SCR. Alerting informs the administrator there is an issue. It is determined that the disk group is owned on the wrong node, and it is manually moved back to NodeA. Soon after, replication successfully resumes since the underlying storage necessary for replication is present on NodeA, and NodeA is subscribed as the SCR target. In this instance cluster membership did NOT change, so the preferred owners list and failback policy were not applied.
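The behavior in the four examples above can be summarized in a small Python model. This is only a sketch under simplifying assumptions: the class and function names are hypothetical, failback is either immediate or off, and the preferred owners list is re-evaluated on every membership change. It does not call any cluster API and is not how the cluster service is implemented.

```python
class DiskGroup:
    """Toy model of a cluster group holding the physical disk resources."""

    def __init__(self, preferred_owners, owner, immediate_failback=True):
        self.preferred_owners = list(preferred_owners)  # ordered, most preferred first
        self.owner = owner
        self.immediate_failback = immediate_failback
        self.up_nodes = set(preferred_owners)  # assume all nodes start online

    def _evaluate(self):
        # Runs only when cluster membership changes.
        candidates = [n for n in self.preferred_owners if n in self.up_nodes]
        if not candidates:
            self.owner = None  # no node available to host the group
        elif self.owner not in self.up_nodes or self.immediate_failback:
            # Failover (owner is down) or failback (a more preferred node is up).
            self.owner = candidates[0]

    def node_down(self, node):
        self.up_nodes.discard(node)
        self._evaluate()

    def node_up(self, node):
        self.up_nodes.add(node)
        self._evaluate()

    def manual_move(self, node):
        # An administrator move does not change membership, so the
        # preferred owners list and failback policy are not evaluated.
        if node not in self.up_nodes:
            raise ValueError(f"{node} is not online")
        self.owner = node


def replication_healthy(group, scr_target):
    """Replication can run only when the SCR target node owns the disks."""
    return group.owner == scr_target


if __name__ == "__main__":
    group = DiskGroup(["NodeA", "NodeB"], owner="NodeA")
    group.manual_move("NodeB")                    # Example #4: no failback occurs
    print(replication_healthy(group, "NodeA"))
    group.node_down("NodeA")                      # Example #3: NodeA reboots...
    group.node_up("NodeA")                        # ...and the group fails back
    print(replication_healthy(group, "NodeA"))
```

The point of the model is the asymmetry between `manual_move` and the membership events: only `node_down` and `node_up` trigger an evaluation, which is why Example #4 needs an administrator to move the group back.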
Establishing the disk group, Preferred Owner, and Failback Policy in Windows 2003
Use the following steps to establish the disk group, preferred owners list, and a failback policy in Windows 2003.
The configuration of preferred owners and a failback policy can be performed from the command line.
To set the list of preferred owners and configure failback:
Examples of these commands:
Establishing the disk group, Preferred Owner, and Failback Policy in Windows 2008
Consider reviewing the following references for more information.
http://support.microsoft.com/kb/197047
http://support.microsoft.com/kb/299631
http://support.microsoft.com/kb/823955