Having spoken with a few customers about whether a local CCR plus SCR or a CCR stretched across 2 data centres is the best solution, I thought I'd write a post.
There is no right or wrong answer to that question; in typical consulting style, 'it depends'. There are various factors to take into consideration when designing the right solution for your customer.
There are also some factors to think about from the client side, such as DNS refresh. If the customer doesn't have a stretched Virtual LAN (VLAN) between data centres, the cluster will be assigned 1 Network Name resource and 2 IP address resources (since the nodes are on separate IP subnets). When the clustered mailbox server (CMS) fails over, it will be assigned a different IP address. As part of the cluster configuration in Windows 2008, we recommend changing the default DNS TTL value for the CMS Network Name resource.
By default the cluster service uses a TTL of 20 minutes. Be careful if you change the DNS TTL value through the DNS management console, as it will be overwritten by the cluster settings. So if you want to change the default value from 20 minutes to our recommended setting of 5 minutes, you'll need to make the change through cluster administrator.
In order to make this change you'll need to be a local administrator on each node in the cluster and have Full Control permission on the cluster.
From a cmd prompt run: cluster.exe res <CMSNetworkNameResource> /priv HostRecordTTL=300 (the value is in seconds, so 300 is the recommended 5 minutes mentioned above)
Take the clustered mailbox server offline by running the Stop-ClusteredMailboxServer cmdlet in PowerShell
Bring the clustered mailbox server back online by running the Start-ClusteredMailboxServer cmdlet.
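Because HostRecordTTL is set in seconds while the recommendation is usually quoted in minutes, it is easy to slip on the units. The following is only a minimal Python sketch of that conversion; the function name is made up for illustration, it just assembles the command line shown above, and it does not run anything against the cluster.

```python
from typing import List


def host_record_ttl_command(resource_name: str, ttl_minutes: int) -> List[str]:
    """Build the cluster.exe arguments that set HostRecordTTL for a Network Name resource."""
    if ttl_minutes <= 0:
        raise ValueError("TTL must be a positive number of minutes")
    ttl_seconds = ttl_minutes * 60  # HostRecordTTL is expressed in seconds
    return ["cluster.exe", "res", resource_name, "/priv", f"HostRecordTTL={ttl_seconds}"]


if __name__ == "__main__":
    # "CMS1" is a hypothetical resource name used purely as an example.
    print(" ".join(host_record_ttl_command("CMS1", 5)))
```

With the recommended 5 minutes this produces HostRecordTTL=300, and the 20 minute default corresponds to HostRecordTTL=1200.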
I’ve listed below a few risks, and how they can be mitigated, if you do decide to go with a stretched CCR over CCR + SCR:
File Share Witness (FSW) Location
Locate the FSW at an alternate location to provide additional resilience to the cluster
Client cache IP refresh interval
This can be configured on the cluster in Windows 2008, or a stretched VLAN can be used
Logical corruption of the databases
SCR would provide protection against this, but take into consideration your Recovery Time Objective (RTO)
Is the network link between physical locations resilient?
Ensure there are alternate routes available
Does the network link between physical locations have low latency (below 50ms)?
Test network latency
Does the network link between physical locations have enough bandwidth?
Test network bandwidth
Backup solution can back up any node in any physical location
Ensure your chosen backup solution can back up both locations in the event of a site failure
Manual configuration required to control message routing within a data centre (SubmissionServerOverrideList)
Ensure your operational guides are up to date with how to configure mail routing
Control Client Access within a data centre
Querying of AD may take place across the data centre interconnect
Potential loss of email data in the event of a site failure
Email will be stored in the transport dumpster of the HT server in the failed site
Requires an in-depth understanding of cluster technology, plus Windows 2008 and Exchange 2007 experience
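For the latency check in the list above (below 50ms between locations), one rough way to get a first reading is to time TCP connections from a node in one data centre to a node in the other. The Python sketch below is only an illustration under stated assumptions: the function names are made up, the target host and port are whatever reachable service you choose, and a TCP connect time is only an approximation of a network round trip. It is not a substitute for proper network testing.

```python
import socket
import statistics
import time
from typing import List


def tcp_connect_times_ms(host: str, port: int, attempts: int = 10,
                         timeout: float = 2.0) -> List[float]:
    """Time several TCP connects to host:port and return each duration in milliseconds."""
    samples = []
    for _ in range(attempts):
        start = time.perf_counter()
        # create_connection completes the TCP handshake, roughly one round trip.
        # Pass an IP address to keep DNS lookup time out of the measurement.
        with socket.create_connection((host, port), timeout=timeout):
            pass
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def within_latency_target(samples_ms: List[float], threshold_ms: float = 50.0) -> bool:
    """Return True when the median sample is below the threshold (50 ms by default)."""
    if not samples_ms:
        raise ValueError("no samples collected")
    return statistics.median(samples_ms) < threshold_ms
```

As a usage example, within_latency_target(tcp_connect_times_ms("192.0.2.10", 445)) would check a hypothetical node at 192.0.2.10 on port 445; substitute an address and port that exist in your environment. The median is used so that a single slow sample does not skew the result.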
Written by Daniel Kenyon-Smith