Many customers have requested instructions on how to enable standby continuous replication (SCR) to use an alternate network interface. By design, standby continuous replication always uses the “public” interface to ship logs and seed the database.
Over the past few weeks we have been working with the Exchange product group on a “supported” method to allow standby continuous replication to use an alternate network interface. This post details how to implement that method and what effect it has on the overall solution.
First, if you are reading this post, you should review the replication service deep dive white paper located at http://technet.microsoft.com/en-us/library/cc535020.aspx (“White Paper: Continuous Replication Deep Dive”). When reviewing this white paper, pay attention to which sources are involved in replication when using standby continuous replication. For example:
Keeping these parameters in mind will help you understand how the following changes will allow for standby continuous replication to use an alternate network interface.
The steps to implement this vary little by operating system. Windows 2008, however, does introduce some changes to the way file shares are handled. Please review this blog post for information on how share scoping in Windows 2008 affects the operation of the replication service (http://blogs.technet.com/timmcmic/archive/2008/12/23/exchange-replication-service-exchange-2007-sp1-and-windows-2008-clusters.aspx).
The following instructions are based on Exchange 2007 SP1 with RU7. All customers implementing these instructions are encouraged to do so on Exchange 2007 SP1 RU7.
Replication behavior when using standby continuous replication over an alternate network interface.
When the instructions are implemented as documented, all network traffic from the SCR target to the SCR source is first routed through the private interface. This can be verified with netmon by reviewing SMB (Windows 2003) or SMBv2 (Windows 2008) traffic.
It is important to note that these instructions only affect the LOG SHIPPING functionality of SCR. Other functions, such as Update-StorageGroupCopy, will only occur using the public interface. This requires that both the source and target have the ability to communicate over both the public and private interfaces. Planning for network sizing should take into account that re-seeding operations using Update-StorageGroupCopy must occur over the public interface.
Unlike continuous replication host names in CCR, there is no automatic failover between interfaces. Should the private interface serving log shipping be unavailable for any reason, log shipping will fail. With this in mind, appropriate monitoring of log copy operations is necessary to ensure replication is functioning. In the event that the network link serving replication is not available, the hosts file entries should be removed and replication resumed over the public interface. As mentioned earlier, your network design considerations should take into account the need to communicate over both the public and private interfaces as well as the potential need to perform log shipping operations over the public interface.
For the solution to be fully supported network connectivity must be available between the source and target on both the private and public interfaces. All replication operations must be able to function on both interfaces.
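The connectivity requirement above can be spot-checked from the target before and after making any changes. The following is only a rough sketch in Python, not part of the supported procedure. It assumes that a successful TCP connection to port 445 (SMB) is a reasonable stand-in for "the source is reachable on this interface"; the function names and the example addresses in the usage note are hypothetical.

```python
import socket


def can_connect(address, port=445, timeout=3.0):
    """Return True if a TCP connection to address:port succeeds within timeout.

    Port 445 (SMB) is an assumption chosen because log shipping is
    file-share based; adjust it if your environment differs.
    """
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False


def check_interfaces(addresses, port=445, timeout=3.0):
    """Map each interface label to the result of a connection attempt.

    addresses is a dict such as {"public": "...", "private": "..."}.
    """
    return {
        label: can_connect(address, port, timeout)
        for label, address in addresses.items()
    }
```

For example, `check_interfaces({"public": "192.168.0.10", "private": "10.1.1.1"})` would return a dictionary with one True/False value per interface. Both addresses here are placeholders; substitute the addresses of your own SCR source. A False on either interface means the requirement above is not met.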
When engaging product support services for assistance with replication while these steps are in use, you may be asked to remove the hosts file entries and verify that log shipping works as originally designed with no modifications.
Behavior of cmdlets used for implementing / managing standby continuous replication when replication is enabled to use an alternate interface.
Get-StorageGroupCopyStatus: No issues noted.
Enable-StorageGroupCopy: No issues noted.
Disable-StorageGroupCopy: No issues noted.
Restore-StorageGroupCopy: No issues noted when machines involved are running Exchange 2007 SP1 RU7. Prior to RU7 it may be necessary to use Restore-StorageGroupCopy -Force for the command to complete successfully.
Update-StorageGroupCopy: Because Update-StorageGroupCopy uses online streaming functionality to seed the database to the target, the network traffic associated with this occurs over the public interface.
Suspend-StorageGroupCopy: No issues noted.
Resume-StorageGroupCopy: No issues noted.
Changes to the SCR activation process when replication is enabled to use an alternate interface.
Whether using the database portability method or the single node cluster method, after running Restore-StorageGroupCopy the entries in the hosts file should be removed or commented out. Once the removal is complete, the DNS resolver cache should be flushed (ipconfig /flushdns) and a ping from the target machine to its own name performed to ensure DNS resolves the correct IP address on the public interface.
When name resolution occurs successfully, Move-Mailbox -ConfigurationOnly or setup.com /recoverCMS can be run to complete the activation process.
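The hosts file cleanup and name resolution check described above could be scripted along the following lines. This is a minimal Python sketch under stated assumptions, not a supported tool: the function names are hypothetical, the code works on the text of the hosts file rather than editing the file in place, and it does not run ipconfig /flushdns for you.

```python
import socket


def comment_out_hosts_entries(hosts_text, names):
    """Return hosts_text with every active entry for one of names commented out.

    Matching is case-insensitive. Existing comment lines are left untouched.
    """
    wanted = {name.lower() for name in names}
    result = []
    for line in hosts_text.splitlines():
        # Ignore anything after '#', then split into address and host names.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and any(f.lower() in wanted for f in fields[1:]):
            result.append("# " + line)
        else:
            result.append(line)
    return "\n".join(result) + "\n"


def resolves_to(name, expected_ip, resolver=socket.gethostbyname):
    """Return True if name resolves to expected_ip using the given resolver.

    The default resolver is the system one, so the DNS resolver cache
    should already have been flushed before this check is meaningful.
    """
    try:
        return resolver(name) == expected_ip
    except OSError:
        # Name could not be resolved at all.
        return False
```

In practice you would read the hosts file, pass its contents and the source server names to `comment_out_hosts_entries`, write the result back, flush the resolver cache, and then call `resolves_to` with the name and the expected public address.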
Configuring networks and network interfaces to support standby continuous replication using an alternate network interface on Windows 2008.
The first step is to configure the network settings for the network interface that will be used for standby continuous replication. These instructions are performed on both the source and target machines. To configure these settings:
The network configuration process is then completed by updating the network binding orders. To update the network binding orders:
This completes the base networking configuration for standalone machines and clustered nodes.
Additional configuration steps for SCR source servers on Windows 2008.
Additional configuration steps for SCR Targets on Windows 2008.
These instructions apply to both standalone and single node SCR targets based on Windows 2008.
Using Notepad, open the hosts file located in c:\Windows\System32\Drivers\Etc
Depending on the source, make the following changes:
Here is a sample hosts file.
# Copyright (c) 1993-2006 Microsoft Corp.
#
# This is a sample HOSTS file used by Microsoft TCP/IP for Windows.
#
# This file contains the mappings of IP addresses to host names. Each
# entry should be kept on an individual line. The IP address should
# be placed in the first column followed by the corresponding host name.
# The IP address and the host name should be separated by at least one
# space.
#
# Additionally, comments (such as these) may be inserted on individual
# lines or following the machine name denoted by a '#' symbol.
#
# For example:
#
# 22.214.171.124 rhino.acme.com # source server
# 126.96.36.199 x.acme.com # x client host

127.0.0.1 localhost
::1 localhost

#Exchange 2007 SP1 / Windows 2008 / Standalone Mailbox Server
10.1.1.1 2008-MBX1
10.1.1.1 2008-MBX1.exchange.msft

#Exchange 2007 SP1 / Windows 2008 / Cluster Continuous Replication (CCR)
10.1.1.3 2008-Node1
10.1.1.3 2008-Node1.exchange.msft
10.1.1.4 2008-Node2
10.1.1.4 2008-Node2.exchange.msft
10.1.1.8 2008-Node5
10.1.1.8 2008-Node5.exchange.msft
10.1.1.9 2008-Node6
10.1.1.9 2008-Node6.exchange.msft

#Exchange 2007 SP1 / Windows 2008 / Single Copy Cluster (SCC)
10.1.1.7 2008-MBX4
10.1.1.7 2008-MBX4.exchange.msft
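Each source in the sample above gets two entries, one for the short name and one for the fully qualified name, both pointing at the replication address. If you maintain many sources, a small helper can generate those pairs consistently. The Python below is an illustrative sketch only; the function name is hypothetical and the DNS suffix is passed in rather than discovered.

```python
import ipaddress


def hosts_entries(servers, dns_suffix):
    """Build hosts file lines for (name, ip) pairs.

    Each server produces two lines: one for the short name and one for
    the name with dns_suffix appended, both mapped to the same address.
    """
    lines = []
    for name, ip in servers:
        # Raises ValueError if ip is not a valid IPv4 or IPv6 address.
        ipaddress.ip_address(ip)
        lines.append(f"{ip} {name}")
        lines.append(f"{ip} {name}.{dns_suffix}")
    return lines
```

Called with `[("2008-Node1", "10.1.1.3"), ("2008-Node2", "10.1.1.4")]` and the suffix `exchange.msft`, this produces the same four CCR lines for those nodes as in the sample file. Rejecting malformed addresses up front is a deliberate choice, since a typo in the hosts file would silently send log shipping to the wrong place.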
Additionally, the replication service may on occasion have to resort to NetBIOS name resolution. To ensure that the correct replication network is always returned, edit the LMHOSTS file and add entries for the NetBIOS name and corresponding IP address.
Using Notepad, open the LMHOSTS file located in c:\Windows\System32\Drivers\Etc
Here is a sample LMHOSTS file.
# Copyright (c) 1993-1999 Microsoft Corp.
#
# This is a sample LMHOSTS file used by the Microsoft TCP/IP for Windows.
#
# This file contains the mappings of IP addresses to computernames
# (NetBIOS) names. Each entry should be kept on an individual line.
# The IP address should be placed in the first column followed by the
# corresponding computername. The address and the computername
# should be separated by at least one space or tab. The "#" character
# is generally used to denote the start of a comment (see the exceptions
# below).
#
# This file is compatible with Microsoft LAN Manager 2.x TCP/IP lmhosts
# files and offers the following extensions:
#
# #PRE
# #DOM:<domain>
# #INCLUDE <filename>
# #BEGIN_ALTERNATE
# #END_ALTERNATE
# \0xnn (non-printing character support)
#
# Following any entry in the file with the characters "#PRE" will cause
# the entry to be preloaded into the name cache. By default, entries are
# not preloaded, but are parsed only after dynamic name resolution fails.
#
# Following an entry with the "#DOM:<domain>" tag will associate the
# entry with the domain specified by <domain>. This affects how the
# browser and logon services behave in TCP/IP environments. To preload
# the host name associated with #DOM entry, it is necessary to also add a
# #PRE to the line. The <domain> is always preloaded although it will not
# be shown when the name cache is viewed.
#
# Specifying "#INCLUDE <filename>" will force the RFC NetBIOS (NBT)
# software to seek the specified <filename> and parse it as if it were
# local. <filename> is generally a UNC-based name, allowing a
# centralized lmhosts file to be maintained on a server.
# It is ALWAYS necessary to provide a mapping for the IP address of the
# server prior to the #INCLUDE. This mapping must use the #PRE directive.
# In addtion the share "public" in the example below must be in the
# LanManServer list of "NullSessionShares" in order for client machines to
# be able to read the lmhosts file successfully. This key is under
# \machine\system\currentcontrolset\services\lanmanserver\parameters\nullsessionshares
# in the registry. Simply add "public" to the list found there.
#
# The #BEGIN_ and #END_ALTERNATE keywords allow multiple #INCLUDE
# statements to be grouped together. Any single successful include
# will cause the group to succeed.
#
# Finally, non-printing characters can be embedded in mappings by
# first surrounding the NetBIOS name in quotations, then using the
# \0xnn notation to specify a hex value for a non-printing character.
#
# The following example illustrates all of these extensions:
#
# 188.8.131.52 rhino #PRE #DOM:networking #net group's DC
# 184.108.40.206 "appname \0x14" #special app server
# 220.127.116.11 popular #PRE #source server
# 18.104.22.168 localsrv #PRE #needed for the include
#
# #BEGIN_ALTERNATE
# #INCLUDE \\localsrv\public\lmhosts
# #INCLUDE \\rhino\public\lmhosts
# #END_ALTERNATE
#
# In the above example, the "appname" server contains a special
# character in its name, the "popular" and "localsrv" server names are
# preloaded, and the "rhino" server name is specified so it can be used
# to later #INCLUDE a centrally maintained lmhosts file if the "localsrv"
# system is unavailable.
#
# Note that the whole file is parsed including comments on each lookup,
# so keeping the number of comments to a minimum will improve performance.
# Therefore it is not advisable to simply add lmhosts file entries onto the
# end of this file.

10.1.1.3 2008-Node1
10.1.1.4 2008-Node2
10.1.1.8 2008-Node5
10.1.1.9 2008-Node6
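The LMHOSTS entries differ from the hosts entries in that only the NetBIOS (short) name is listed. A helper along these lines can generate them and catch two easy mistakes: names longer than the 15 characters NetBIOS allows for a computer name, and addresses that are not valid IPv4. This is a sketch in Python with a hypothetical function name, not part of the documented procedure.

```python
import ipaddress

# NetBIOS computer names are limited to 15 characters.
MAX_NETBIOS_NAME_LENGTH = 15


def lmhosts_entries(servers):
    """Build LMHOSTS lines ("ip name") for (netbios_name, ip) pairs."""
    lines = []
    for name, ip in servers:
        if len(name) > MAX_NETBIOS_NAME_LENGTH:
            raise ValueError(f"NetBIOS name too long: {name!r}")
        # Raises ValueError if ip is not a valid IPv4 address.
        ipaddress.IPv4Address(ip)
        lines.append(f"{ip} {name}")
    return lines
```

Passing `[("2008-Node1", "10.1.1.3"), ("2008-Node2", "10.1.1.4")]` yields the first two entries shown in the sample above.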
This completes the configuration steps for Windows 2008.
Configuring networks and network interfaces to support standby continuous replication using an alternate network interface on Windows 2003.
Additional configuration steps for SCR source servers on Windows 2003.
Additional configuration steps for SCR Targets on Windows 2003.
These instructions apply to both standalone and single node SCR targets based on Windows 2003.
#Exchange 2007 SP1 / Windows 2003 / Standalone Mailbox Server
10.1.1.1 2003-MBX1
10.1.1.1 2003-MBX1.exchange.msft

#Exchange 2007 SP1 / Windows 2003 / Cluster Continuous Replication (CCR)
10.1.1.3 2003-Node1
10.1.1.3 2003-Node1.exchange.msft
10.1.1.4 2003-Node2
10.1.1.4 2003-Node2.exchange.msft

#Exchange 2007 SP1 / Windows 2003 / Single Copy Cluster (SCC)
10.1.1.7 2003-MBX4
10.1.1.7 2003-MBX4.exchange.msft

10.1.1.3 2003-Node1
10.1.1.4 2003-Node2
This completes the configuration steps for Windows 2003.
Updated Sunday, August 9th, 2009 with LMHOSTS instructions.
Thanks for your post. Interesting stuff.
For our solution, I'm looking into possible ways to get the database seeding performed over the alternate network as well. Since you cannot copy an open database, it would seem that the only way to accomplish this would be to:
- Dismount the database and perform the copy
- Restore a backup to a flat file
- Use some VSS functionality of some kind to snap a copy of the database and copy it over.
But this really calls for implementing this ability in the code ;)