Microsoft Azure Site Recovery: Your DR Site in Microsoft Azure

One of the biggest reveals during the TechEd 2014 North America keynote was the Preview of Disaster Recovery to Azure using Azure Site Recovery. Azure Site Recovery (ASR) is the new name for Hyper-V Recovery Manager (HVRM). HVRM was the first use of the Azure public cloud for off-premises automation controlling on-premises private cloud fabric. HVRM had a lot of fans for its simplification of a complex and expensive process (DR site management and failover), but fewer adopters, since the solution required customers to have already invested in a physical or partner-hosted DR site running a Microsoft private cloud.
HVRM required Hyper-V servers managed by System Center Virtual Machine Manager (SCVMM) at both sides of the replication channel, that is, in both the primary and the DR datacenters. This capability created two scenarios: (1) organizations could provide their own on-premises private cloud to use as the DR site or they could (2) partner with a service provider to run the Hyper-V replicas in a hosted private cloud. ASR retains that functionality and adds a major new feature: (3) The DR side of the replication channel can be in Microsoft Azure.
A top customer ‘ask’ for new capabilities in Azure, this scenario requires none of the capital expenditure (CapEx) of building the secondary site for the DR copies of the VMs.  The recovery VMs can exist in the Azure cloud and not require a second customer or partner datacenter. Figure 1 is a Microsoft slide from TechEd 2014 showing the DR to Azure scenario:
Fig-1-RecoverToMicrosoftAzureSlide1

Figure 1 - Azure Site Recovery (ASR) with DR site in Microsoft Azure

ASR manages the orchestration of all three scenarios—customer DR site, partner DR site, Azure DR site—in the same manner and from the same interface in the Azure portal. You can now reduce the footprint of your datacenters by looking to Azure to host the DR side of your protected private clouds.
Upgrading from HVRM to ASR
The Azure subscription featured in this article is already configured to use a backup vault in Azure and HVRM for the management of the replication and DR activation of two VMs in a private cloud. The VMs in the ‘protected cloud’ reside in a primary customer premises datacenter, and are replicated to a recovery cloud in a second datacenter, a private cloud hosted by a service provider, the Disaster Recovery (DR) site.
We are going to implement the new feature introduced by ASR: locating some VM replicas in Microsoft Azure. To do this we need to upgrade the HVRM agents to newer ASR-compatible agents. When we navigate in the Azure portal to the Dashboard of the Azure Site Recovery Vault, we are prompted to download and install agents for both the SCVMM servers and the Hyper-V hosts, as seen in Figure 2.
Fig-2-Updated-ASR-Provider

Figure 2 - New Hyper-V Agents and updated SCVMM agents needed to use ASR.

New with ASR is the Hyper-V agent. In the former HVRM model, since both ends of the replica channel were Hyper-V, the built-in Hyper-V Replica function of Windows Server 2012 was sufficient for the solution to work and no agent was required for the Hyper-V hosts. The ability of a Hyper-V host to replicate directly with Microsoft Azure is not part of the base OS or the SCVMM agent, so a new agent is introduced.
There is also an updated SCVMM agent (or ‘provider’) that is required for ASR to replicate on-premises VMs with Azure. Download the new SCVMM agent package (VMMASRProvider_x64) from the Azure portal and upgrade the SCVMM agent (previously VMMHRMProvider_x64) on your SCVMM computers.

·        The first page of the upgrade wizard (Figure 3) includes the note that if you are using Azure for the DR site, the latest release of SCVMM 2012 R2 is required on premises. If the DR site is instead hosted by SCVMM, ASR can also work with the previous release, SCVMM 2012 SP1.

Fig-3-Upgrade-Dialog

Figure 3 - Upgrading Hyper-V Recovery Manager (HVRM) to Azure Site Recovery (ASR) on the SCVMM server.

Then download the new Hyper-V agent package (MARSAgentInstaller) and run it on your Hyper-V hosts. This will install the Microsoft Azure Recovery Services (MARS) Agent on the hosts. Windows Identity Foundation, .NET Framework 4 and Windows PowerShell are verified or installed during the agent setup. As seen in Figure 4, the MARS agent requires a cache location that must have free space equivalent to 10% of the data that will be backed up.
Fig-4-MARS-Agent-Install

Figure 4 - Specify a cache location during installation of the MARS agent on Hyper-V hosts.
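As a rough aid for sizing the cache volume, the 10% rule mentioned above can be checked with a few lines of Python. This is a minimal sketch and not part of the MARS agent: the function names are hypothetical, and it assumes you already have an estimate of how much data the host will protect.

```python
import shutil

# The 10% figure quoted for the MARS agent cache location.
CACHE_PERCENT = 10


def required_cache_bytes(protected_data_bytes: int) -> int:
    """Return the cache space needed for the given amount of protected data."""
    if protected_data_bytes < 0:
        raise ValueError("protected_data_bytes must be non-negative")
    # Integer arithmetic, rounded up to a whole byte.
    return (protected_data_bytes * CACHE_PERCENT + 99) // 100


def cache_has_room(cache_path: str, protected_data_bytes: int) -> bool:
    """Check whether the volume holding cache_path has enough free space."""
    free_bytes = shutil.disk_usage(cache_path).free
    return free_bytes >= required_cache_bytes(protected_data_bytes)
```

For example, a host protecting about 500 GB of virtual disks would call `cache_has_room(path, 500 * 1024**3)`, where `path` is whatever folder you plan to enter in the agent setup wizard.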

Starting to use Azure as a DR Target
You start using Azure as the DR site in a Hyper-V replica scenario by creating a new recovery plan with Microsoft Azure as the target. You won’t be able to directly migrate the computers in an existing HVRM-protected cloud as long as their source cloud remains targeted to an SCVMM-based private cloud at the DR site. A given SCVMM private cloud can only replicate to one or the other of SCVMM or Azure targets. You can’t mix VM replication targets in the same cloud, some replicating to Azure and others to a private cloud.
To retarget VM replicas from SCVMM at the DR site to Azure as the DR site, create a new cloud with Microsoft Azure as the replication target and move VMs from the previous cloud to the new one. Alternatively, you could disable and remove HVRM from the source cloud, then redeploy fresh on ASR using the original cloud. Just remember a source cloud can have only one target at a time. Also, you must have at least one Azure Virtual Network configured in your Azure subscription, available to be the target network for replicated VMs in Azure when a failover action occurs.
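The "one target per source cloud" rule is easy to lose track of when planning a migration, so here is a small Python model of it. It is only an illustration of the constraint described above: the `SourceCloud` class, the target labels, and `move_vm` are hypothetical names and do not correspond to any SCVMM or Azure API.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# The two kinds of replication target described in the article.
VALID_TARGETS = ("scvmm", "azure")


@dataclass
class SourceCloud:
    """Hypothetical stand-in for an SCVMM private cloud protected by ASR."""
    name: str
    target: Optional[str] = None
    vms: List[str] = field(default_factory=list)

    def set_target(self, target: str) -> None:
        if target not in VALID_TARGETS:
            raise ValueError(f"unknown target: {target!r}")
        # A source cloud can have only one target at a time.
        if self.target is not None and self.target != target:
            raise ValueError(
                f"cloud {self.name!r} already replicates to {self.target!r}"
            )
        self.target = target


def move_vm(vm_name: str, old_cloud: SourceCloud, new_cloud: SourceCloud) -> None:
    """Retarget one VM by moving it to a cloud that has the desired target."""
    old_cloud.vms.remove(vm_name)  # raises ValueError if the VM is not there
    new_cloud.vms.append(vm_name)
```

In this model, retargeting a VM from an SCVMM DR site to Azure means creating a second `SourceCloud` whose target is `"azure"` and calling `move_vm`, which mirrors the first of the two approaches described above.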
Here are the steps to create a DR site in Azure (after installing the VMM agent updates and Hyper-V MARS agents):

1.      On the SCVMM server of the source network, create a cloud and configure the Properties -> General page of the cloud to Send configuration data about this cloud to the Windows Azure Hyper-V Recovery Manager.

2.      In the Azure portal, navigate to Recovery Services -> <Recovery Vault name> -> Protected Items and wait for the cloud exposed in step 1 to appear.

3.      Click on the cloud name and push the Configure Protection Settings button.

4.      Select Microsoft Azure from the Target drop-down list. The default settings for replication are shown in Figure 5. Push the Save button at the bottom center of the page when confirmed.

Fig-5-Configure-Azure-Target

Figure 5 - Enable protection of a source cloud with Microsoft Azure as the target.

5.      Once you have enabled protection of the source cloud with Microsoft Azure as the target, you can add VMs to the protection plan.

6.      To add a VM Replica to Azure, click on the Enable Protection button from the Virtual Machines page of the protected cloud. You will see a list of eligible VMs for replica creation. Click the checkbox next to the computer to begin replication. Some notes to be aware of in the preview release of ASR:

 

o   Only Hyper-V “Generation 1” VMs can be replicated. (Source VMs can use a VHD or VHDX-format virtual hard disk; in Azure, the replica VMs will use the VHD format.)

o   VMs must have exactly one disk specified as the operating system disk.

 

7.      You can watch the status of the replication from the Jobs tab of the Recovery Vault.

8.      Once replication is started, you need to next map the on-premises network of each VM to an Azure virtual network:

·        Navigate in the Azure portal to Recovery Services -> <Recovery Vault name> -> Resources -> Networks.

·        Select the Source and Target locations, specifying Microsoft Azure for the target.

·        Then select the source network in the left column and push the Map button.

·        Select the Azure Virtual Network in your subscription that matches where you want the recovery VM connected as shown in Figure 6.

 

Fig-6-Map-Networks

Figure 6 - Map target virtual networks in Azure to source private cloud networks.

9.      Replica creation may take some hours as the VM image is uploaded to Microsoft Azure.

·        When the Replication Status for the VM is Protected, view the VM’s configuration in Azure to confirm that the machine size for the target Azure VM has been correctly selected.

·        Find the VM configuration at Recovery Services -> <Recovery Vault name> -> Protected Items -> <Source Cloud name> -> Virtual Machines -> <VM name> -> Source and target properties as shown in Figure 7.

Fig-7-Source-and-Target-Properties

Figure 7 - Modify the machine size of the target VM in Azure if necessary.

10.   When replication to Azure is enabled, the source VM’s Hyper-V guest computer properties will show Hyper-V Replica (HVR) enabled with target Microsoft Azure as shown in Figure 8. Also, the source VM’s SCVMM hardware configuration will automatically be updated to Enable Hyper-V Recovery Manager protection for this virtual machine, as shown in Figure 9.

Fig-8-HyperV-Replica-in-Azure

Figure 8 - Hyper-V Replication Health page for a VM protected to Microsoft Azure.

 

Fig-9-Enable-from-SCVMM

Figure 9 - SCVMM computer hardware configuration for VM replicated to Microsoft Azure.

11.   The last step before you can perform ‘one button failover to Azure’ is to create a Recovery plan that includes the new VM replica(s) in Azure. 

·        Navigate to Recovery Services -> <Recovery Vault name> -> Recovery Plans and push the Create button.

·        Specify a name for the recovery plan, and select the SCVMM server for the source cloud and select Microsoft Azure as the target (shown in Figure 10).

Fig-10-Recovery-Plan-Name-and-Target

Figure 10 - Name your recovery plan, select the source SCVMM server, and Microsoft Azure as target.

·        VMs in the protected cloud that can be replicated to Azure will be listed as shown in Figure 11. Select the VMs to be replicated to Azure and push the checkmark button.

Fig-11-Create-Recovery-Plan-Select-VM

Figure 11 - Select the VMs in the protected cloud to replicate to Azure.
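The preview-release notes in step 6 reduce to two checks you can run against your VM inventory before enabling protection. The following Python sketch assumes a hypothetical inventory model (`VirtualMachine` and `Disk` are illustrative types, not SCVMM objects); it only encodes the two rules stated in this article.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Disk:
    path: str
    is_os_disk: bool = False


@dataclass
class VirtualMachine:
    name: str
    generation: int  # Hyper-V VM generation: 1 or 2
    disks: List[Disk]


def ineligibility_reasons(vm: VirtualMachine) -> List[str]:
    """Return the reasons a VM cannot be replicated to Azure (empty if none)."""
    reasons = []
    # Rule 1: only Generation 1 VMs can be replicated in the preview.
    if vm.generation != 1:
        reasons.append("only Generation 1 VMs can be replicated")
    # Rule 2: exactly one disk must be marked as the operating system disk.
    os_disk_count = sum(1 for disk in vm.disks if disk.is_os_disk)
    if os_disk_count != 1:
        reasons.append(f"expected exactly one OS disk, found {os_disk_count}")
    return reasons
```

A VM with an empty result list would be expected to show up as eligible on the Enable Protection page; anything else is worth fixing before you start a multi-hour upload.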

Test Failover

12.   One last major task before you rest easy with your VM replicas in Azure is to test the DR failover process. Microsoft makes this normally complex task quite simple with a single button as shown in Figure 12.

·        The test will be conducted without impacting the production VM and replication process.

·        A temporary instance of an Azure Service will be created for the test.

·        Locate and select the recovery plan that uses Microsoft Azure as the target and press the Test Failover button.

Fig-12-Push-Test-Button

Figure 12 - Launch a Test Failover job from the Recovery Plans page of the Azure portal.

13.   After launching a test failover job, you do need to make one more decision: whether to connect the VM replica running the test to a network in Azure or not. See in Figure 13 that you are prompted to create a new virtual network if necessary: You can’t define the production network (the one you selected in the “map” Step 8 of this article) as the network for the test.

Fig-13-Confirm-Test-Networks

Figure 13 - Optionally specify an Azure Virtual Network for connection of test failover VMs.

14.   The test job will actually spin up VMs in Azure with the computer names of the source VMs, and you will see the VMs appear in the Azure portal Virtual Machines page. When the test status of a VM is Waiting for Action, the test VM is running in Azure and ready for you to inspect.

·        Even if you didn’t specify a network for the test VMs, recovery test VMs are connected to the Internet and a random Azure private network.

·        If desired, open Endpoints in the recovery VMs such as RDP in order to login and verify the test recovery VM started successfully in Azure.

·        When satisfied the test recovery VMs are healthy, push the Complete Test button as shown in Figure 14. This will delete the test environment and reset to the configuration before the test. Test VMs will disappear from your Azure portal Virtual Machines page.

Fig-14-Complete-Testing

Figure 14 - Push the Complete Test button in the Recovery Plan Job view when satisfied with testing.
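The network choice in step 13 follows a simple rule: a test failover may run with no network at all, or on a dedicated test network, but never on the production network mapped in step 8. A minimal Python sketch of that rule follows; the function name and the network names in the usage note are hypothetical, and the comparison is a plain string match rather than anything the portal actually does.

```python
from typing import Optional


def validate_test_network(test_network: Optional[str],
                          mapped_production_network: str) -> Optional[str]:
    """Return the network to use for a test failover, or None for no network.

    Raises ValueError if the production network mapped for real failovers
    is chosen for the test.
    """
    if test_network is None:
        # Running the test without selecting a network is allowed.
        return None
    if test_network == mapped_production_network:
        raise ValueError(
            "the production network cannot be used for a test failover; "
            "create a separate virtual network for testing"
        )
    return test_network
```

For instance, with a production mapping to a network named `Prod-VNet`, passing `Test-VNet` is accepted, while passing `Prod-VNet` is rejected.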

Planned Failover

15.   It is not difficult to perform an actual failover and failback to verify those processes; this is known as a Planned Failover. You should perform a planned failover for validation or training purposes during your evaluation of ASR.

·        The planned failover will impact the production VMs and will reverse the replication process, making the former target VMs in Azure the source VMs for the eventual failback job.

·        Locate and select the recovery plan that uses Microsoft Azure as the target and press the Failover button, then click the Planned Failover option.

·        At the Confirm Failover page click the checkmark button.

 

16.   When the failover is completed you will have new VMs in Azure that were replicas of the VMs in the source (production) site. These VMs will appear in your Azure portal with Virtual Machine names that match their source computer names.

·        VMs will be connected to the Azure Virtual Network you specified during setup of the recovery plan.

·        Open Endpoints in the VMs such as RDP and other necessary inbound services.

·        VMs in Azure will register their DNS with your Active Directory Domain Controller and their private DNS address is updated across the enterprise with the IP address of the VM in the Azure Virtual Network. Figure 15 shows the ‘acid test’ in seamless DR site failover. In this command line screenshot, a network client does a ping to the DNS name of a protected VM before and after failover to Azure (and DNS client TTL timeout, usually 15-60 minutes).

Fig-15-Failover

Figure 15 - Pinging a server that fails over to Azure: Before and After.

17.   After failover, normal Active Directory (AD) dynamic DNS client name registration processes will direct traffic addressed to the private DNS name of the VM to the new production instance in Azure. The only meaningful difference is that network access is at Internet speed rather than LAN speed. This network traffic follows the pre-established routing scheme for your enterprise LAN/WAN to your Azure virtual network (VNET).

 

18.   One more action must be taken to complete the failover process: the Commit job. After the failover is completed, virtual machines are in a state of Commit Pending. Click Commit to initiate the commit process. This selects the VMs in the Commit Pending state and completes the commit action for them.
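The before-and-after ping comparison in Figure 15 can also be scripted, which is handy when you are waiting for the DNS client TTL to expire. The Python sketch below uses the standard library's `socket.gethostbyname`; the helper names are hypothetical, and it assumes the protected VM has an IPv4 address registered in DNS.

```python
import socket
from typing import Callable


def resolve(hostname: str) -> str:
    """Return the IPv4 address the local resolver currently gives for hostname."""
    return socket.gethostbyname(hostname)


def failed_over(hostname: str, address_before: str,
                resolver: Callable[[str], str] = resolve) -> bool:
    """True once the name resolves to a different address than before failover."""
    return resolver(hostname) != address_before
```

Record the address with `resolve(...)` before starting the planned failover, then poll `failed_over(...)` afterwards. The `resolver` parameter exists only so the check can be exercised without a live DNS server.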

Failback

19.   To fail back to the original production site after committing a failover, you actually perform a Planned Failover again.

·        Navigate to Recovery Services -> <Recovery Vault name> -> Recovery Plans -> <Recovery Plan name> -> Virtual Machines and push the Failover button, then select the Planned Failover option.

·        Observe the confirmation page seen in Figure 16. Select to synchronize data before or during failover and click the checkmark button.

Fig-16-Failback 

Figure 16 - Failback to the original source cloud after a failover to Microsoft Azure.

20.   If you selected the “sync before failover” option (the default), one more action must be taken to complete the failback process: the Complete Failover job. After the initial synchronization is completed, virtual machines are in a state of Data Synchronization Completed. Click Complete Failover to finish the failback process as shown in Figure 17.

Fig-17-Failback-Waiting-For-User-Input

Figure 17 - A failback job with advance synchronization: Press Complete Failover button when ready to failback.

21.   After resuming the planned failover (to failback) to the production datacenter on-premises, watch the progress of the tasks at the Jobs tab of the Azure Site Recovery Vault service in your Azure portal as seen in Figure 18.

·        The Hyper-V Replica status of the on-premises VM will be Failback in Progress during the final failback tasks, which can take several hours to re-sync with the on-premises VM.

·        You can monitor the progress of the data transfer (Failback Replication percent complete) from the on-premises Hyper-V console.

 

 Fig-18-Job-Steps-in-Failback

Figure 18 - Watching the final tasks in the failback to production datacenter job as the on-premises VM is started.

22.   Failback is almost complete when the on-premises VM is started and the Azure replica VM is stopped. At this stage, you will find a final Commit action waiting for you in the Protected Cloud -> Virtual Machines page of your Azure portal as shown in Figure 19.

Fig-19-Commit-Final-Failback

Figure 19 - Complete the Failback process by directing a final Commit action.

23.   The Failback procedure is finalized by the Commit job: this cleans up the failover resources in Azure such as deleting the recovery VMs and restoring the normal Hyper-V Replica direction and process. Delete any empty or temporary blob containers created during the failovers from the Storage page of your Azure portal (Storage -> <Storage Account name> -> Containers).
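Several of the jobs above pause until you press a button in the portal. As a quick reference, the Python sketch below maps each waiting status named in this article to the button that moves the job forward. The status and button strings are copied from the steps above; the lookup itself is a hypothetical aid, and the portal may show other states that are not covered here.

```python
from typing import Optional

# Waiting status (as named in this article) -> portal button that resumes the job.
NEXT_ACTION = {
    "Waiting for Action": "Complete Test",                   # test failover, step 14
    "Commit Pending": "Commit",                              # planned failover, step 18
    "Data Synchronization Completed": "Complete Failover",   # failback, step 20
}


def next_manual_action(status: str) -> Optional[str]:
    """Return the button to press for a waiting status, or None if no action is listed."""
    return NEXT_ACTION.get(status)
```

A status such as Protected or Failback in Progress returns `None`, meaning the article does not list a manual action for it.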

 

After completing the steps described in this article, you will have experience with the main features of the Azure Site Recovery service when using Microsoft Azure as the DR site. Aside from the obvious CapEx win of using public cloud compared to private cloud for hosting DR copies of production VMs, risk in DR site activation and failback to production is greatly reduced by locating the DR infrastructure in Azure.

 

The chance for errors in the riskiest parts of the DR challenge (orchestration to perform failover, test failover, and failback jobs, and the readiness and health of the virtualization plant at the recovery site) is essentially eliminated by Azure Site Recovery. So there is a big reliability improvement, as well as a big cost savings element, that makes DR failover to Azure a universal win. Meet the new de facto standard in enterprise DR architecture.

About John Joyner

John Joyner is a product development director and senior architect for a managed services provider. A Cloud and Datacenter Management MVP, John is co-author of the four-book series Operations Manager: Unleashed. John is happy to answer any questions around SCOM and can be reached at any of the links below.

You can reach John and find out more about him here:

Blog: http://opsmgrunleashed.wordpress.com/

Twitter: @john_joyner

MVP Profile: http://mvp.microsoft.com/en-us/MVP/John%20Joyner-4012882

/Enjoy!

Christian Booth (ChBooth) | Sr. Program Manager | System Center
