Cloud Insights from Brad Anderson, Corporate Vice President, Windows Server & System Center
There is no shortage of disasters – either natural or man-made – but the cloud has also made mitigating these disasters much simpler and more cost-effective. These technical advances have moved Disaster Recovery (DR) from something that was applied only to the most mission-critical workloads to something that can now be applied to any workload at an economical price.
Having a DR strategy is widely considered a base-line requirement, and even a mandatory feature for many workloads in most organizations. Very often it is necessary just to survive – just ask the financial services companies whose offices and datacenters were underwater during Hurricane Sandy but didn’t miss a day of work.
What’s surprising, however, is how many organizations don’t have a DR plan – or they have one that is under-managed, under-powered, or serving only a few workloads. In any case, for any type of organization, the Hybrid Cloud can make a big impact on DR.
Historically, DR solutions have been expensive and complex to deploy. For any enterprise wanting to roll out a reliable DR solution, the costs were prohibitively high. This meant that only the largest organizations could afford DR – and only then if they had a technically sophisticated IT team. The issue of complexity took its toll on many teams who didn’t anticipate the extensive configuration and setup necessary to maintain the system, the complex mechanisms for DR drills, or the challenges with scale.
Many of these same problems have persisted into modern DR solutions, and this leaves many workloads vulnerable. This then exposes businesses to a range of issues like compliance and audit violations – to say nothing of the loss of valuable data.
These modern DR problems are further compounded (the list just keeps going, right?) by the fact that many current DR solutions are built using components from multiple vendors which means that any attempt to provide simplicity through a single pane of management is basically impossible.
In recent years, however, things have started to get better. With the adoption of private clouds, DR solutions have started to leverage the benefits of virtualization (compute and networking) and have enabled DR drills against a replica copy running on a recovery site in an isolated network. This capability is a boon for IT Admins responsible for DR, since they can now perform DR drills periodically without the fear of impacting production.
Even with virtualization, however, many of the other challenges around complexity persist. For example, a complete end-to-end (E2E) DR solution should offer DR capabilities beyond just data replication, e.g. bringing reliability to apps on the recovery site with an optimized recovery time objective (RTO), ensuring connectivity to clients through the required network configuration, and meeting compliance needs like reporting. Simple virtualization also doesn’t resolve the need for high-friction maintenance of multiple components. Other concerns include the deployment and ongoing maintenance needed to ensure High Availability of the DR solution itself (after all, you don’t want your DR solution to be down when you need it the most), and the simple architectural fact that most current DR solutions are designed around an outdated model that assumes there will be one primary and one recovery site (which, for a network of datacenters around the globe, is not likely).
I don’t share this detailed overview of the shortcomings of current DR solutions to spread doom and gloom about the safety of your data, but to illustrate where these solutions fall short, and how a Hybrid Cloud effectively and practically addresses all of them.
Managing your DR strategy in a Hybrid environment has big advantages, and I covered this topic in some detail back during the 2012 R2 Series. The advantages begin with the ability to leverage a world class public cloud like Windows Azure, as well as the use of Hyper-V Recovery Manager (HRM). These two tools constitute the backbone of Microsoft’s end-to-end DR Management – and they have been designed with Hybrid cloud design points that overcome common/chronic DR pitfalls with 7 unique advantages:
Considering how difficult DR has traditionally been, this deserves to go first – and this is a place where the Microsoft Hybrid Cloud really shines. By offering Disaster Recovery as a Service, Microsoft assumes the burden of setup, configuration, and ongoing maintenance to the DR management software. For an organization of any size, this is a huge benefit.
To put this in context, consider the HRM Service. With HRM you can set up DR for a few hundred VMs in minutes; DR solutions from other vendors are going to take days. That difference not only shortens your time-to-value, but it also reduces your ongoing maintenance of the DR Management software.
The simplicity of a Hybrid environment also ensures you don’t lose big chunks of hours switching between the DR protocols in a primary site vs. secondary backup sites. The Microsoft Hybrid Cloud has just one console and one interface – no matter what actions need to be performed.
The time savings from the simplicity of the Hybrid DR solution have a real impact on the bottom line in a variety of ways. The HRM Service helps customers reduce the costs associated with DR solutions by leveraging the in-box Windows replication technology, Hyper-V Replica. Having this in-box assures that you don’t need to purchase additional replication licenses, and, since DR management software is provided as SaaS, you don’t need to pay for the additional OS and database licenses that are demanded by typical DR solutions.
The flexibility and ability to personalize a Hybrid environment means that you can leverage this pay-as-you-go model to start small and expand your DR protection gradually. Skipping the need for a big DR budget commitment up front can go a long way toward helping your organization get started.
Today our customers expect DR for every application deployed in a private cloud. This often leaves enterprise IT struggling to keep up with both expectations and demand, since so many of the current DR solutions are not ready to work at cloud scale. HRM leverages Azure’s capacity to provide a scalable solution that can meet a customer’s need for DR at massive cloud scale.
Due to regulatory or compliance issues, you may not be able to move your application data to a public cloud. This doesn’t mean your data is at risk for a disaster, however. The Hybrid cloud design point of HRM enables you to continue meeting regulatory and compliance norms as your applications are deployed in your own datacenter and the data of your application is replicated and encrypted on your own private network to a recovery site. To be clear: In this scenario none of the application data ever goes to the HRM service in the cloud. Instead, the HRM service only holds the metadata related to DR setup, orchestration, and management. Pretty cool, right?
In traditional DR solutions a big (and regular) area of investment is the high availability of the DR management infrastructure (i.e. avoiding the downtime of your insurance). HRM leverages Azure’s highly available cloud platform and provides protection from both local and regional disasters.
For example, if there are compute or storage failures in one area, the service continues to operate normally because of the high availability provided by Azure’s compute and storage platforms which are replicating multiple copies. The obvious next question is, “What if the whole Azure datacenter experiences an outage?” In these very rare occurrences, Azure provides high availability through its global distribution of datacenters.
Another advantage of having the DR service hosted in a Hybrid Cloud is its ability to act as an independent witness for both your primary and secondary data centers. HRM accomplishes this by closely monitoring your sites and detecting any issues with the primary and recovery sites. This feature is critical to preventing a complex problem in the DR world, commonly referred to as “Split Brain,” in which the same VMs end up running on both sites at once. When you perform a failover from HRM, it first checks the primary site, and, if the VMs are running on the primary site, it shuts them down before bringing the VMs up at the recovery site.
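To make the order of operations in that failover check concrete, here is a minimal Python sketch. It is only an illustration of the check-then-shut-down-then-start sequence described above, not HRM's actual implementation or API: the `Site` class, the `failover` function, and the rule for an unreachable primary (skip the shutdown and start the replica anyway) are all assumptions made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """Hypothetical stand-in for a datacenter; not an HRM object."""
    name: str
    reachable: bool = True
    running: set = field(default_factory=set)  # names of VMs currently running


def failover(primary: Site, recovery: Site, vm_names: list) -> list:
    """Fail the given VMs over to the recovery site, avoiding split brain.

    Returns a log of the actions taken, in order.
    """
    log = []
    for vm in vm_names:
        # Step 1: check the primary first. If the VM is still running there,
        # stop it so that two live copies never exist at the same time.
        # Assumption: an unreachable primary is treated as down, so there is
        # nothing to shut off.
        if primary.reachable and vm in primary.running:
            primary.running.discard(vm)
            log.append(f"shut down {vm} on {primary.name}")
        # Step 2: only now start the replica at the recovery site.
        recovery.running.add(vm)
        log.append(f"started {vm} on {recovery.name}")
    return log


if __name__ == "__main__":
    primary = Site("primary", running={"web01"})
    recovery = Site("recovery")
    for action in failover(primary, recovery, ["web01"]):
        print(action)
```

The point of the sketch is the ordering: the shutdown on the primary always precedes the start on the recovery site, which is what keeps both copies from serving clients at once.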
As noted above, many DR solutions are generally designed assuming a traditional DR model where you have a primary + a recovery site. There is a built-in constraint to this, however, since enterprises today have multiple global data centers that struggle to use a single pane of glass across all these datacenters. The Microsoft Hybrid Cloud not only provides this single pane, but it also enables all possible topologies among them, e.g. One-One, One-Many, Many-Many, etc. This is something that no other DR software can or does provide.
Last but not least, by having your DR service deployed in a Hybrid Cloud you can access it from anywhere. Imagine being out of your office when a disaster occurs and struggling to VPN to your corporate network to perform a failover. The accessibility of the Microsoft Hybrid Cloud allows you to access the service from any location and with any device – even from your phone.
There are a lot of great examples of a Hybrid environment making all the difference in a DR scenario. Here are a couple to consider:
As noted in last week’s post, any good discussion about the business value of the Hybrid Cloud deserves a look at the technology which makes it so effective in this scenario:
As an SMB who doesn't have a secondary datacenter what I'm really looking for is Windows Azure to fill those shoes via IaaS. I want to be able to replicate my VMs to an Azure datacenter, and then use HVRM to orchestrate with on premise. I can only assume that Windows Azure is headed in this direction, but when?
I'm sure many small to medium organizations would like to use HRM with Azure as their secondary datacenter. It only makes sense. Are we there yet?