SQL Server AlwaysOn Availability Groups – New combined HA/DR solution for SQL Server and the path to your own SQL Server Cloud

SQL Server AlwaysOn Availability Groups is a new combined HA/DR solution introduced with SQL Server Codename “Denali”. AlwaysOn Availability Groups can help you to maximize the availability to a set of user-databases. The concept also helps you to separate your read-only workload to a secondary node and therefore helps you to make use of hardware which was passive before.

Why a new concept for HA/DR? We already have Failover Clustering, Database Mirroring and LogShipping…

Yes, true… but with every HA or DR solution available before you had some limitations:

Failover Clustering:

  • Shared storage needed
  • No active secondaries available
  • If DR is needed, you have to involve synchronous storage replication mechanisms or combine Failover Clustering with Database Mirroring or LogShipping

Database Mirroring:

  • Application needs SNAC or has to use FailoverPartner parameter in connection string for automatic failover
  • No virtual network name to connect (which would be transparent for the application in case of a failover)
  • Only 1 active secondary in combination of database snapshots on the mirror

LogShipping

  • No automatic failover
  • Secondaries offline while applying transactionlog backups

The list above is not complete, just some points which came to my mind while writing this blog post. Certainly, every solution has its advantages and we will look later on, where “classic” solutions may be more suitable than AlwaysOn Availability Groups.

So what do AlwaysOn Availability Groups deliver me to achieve the 9s I need?

Feature list of AlwaysOn Availability Groups:

  • Multi database failover
  • Multiple secondaries (A secondary is conceptually similar to a Mirror in Database Mirroring)
    • Max of 4 secondaries
  • Synchronous and asynchronous data movement
    • Support for 2 synchronous secondaries for additional data protection
  • Built in compression and encryption of transport
  • Automatic, Manual and Forced Failover
  • Flexible failover policy
  • Automatic Page Repair
  • Active Secondary
    • Readable secondary
    • Secondary backup
  • Automatic application redirection using virtual name
  • Configuration Wizard for simplified deployment
  • AlwaysOn Dashboard
  • System Center Integration through new Management Pack (by the way, Database Mirroring and Replication Monitoring for SQL Server 2008 R2 will also be available J)
  • Automation using Powershell
  • Rich diagnostic infrastructure
    • DMV, Perfmon Counter, Xevents etc…

How does the Architecture look like?

Technically you could state that AlwaysOn Availability Groups are like “enhanced database mirroring” using Windows Server Failover Clustering (WSFC) technology for health detection and failover of resources. But there are lots of improvements in terms of performance for example compared to database mirroring! Each availability group defines a set of two or more failover partners known as availability replicas. Availability replicas are components of the availability group that are hosted by separate instances of SQL Server residing on different nodes of a WSFC failover cluster. Each of these SQL Server instances is either a SQL Server failover cluster instance (FCI) or stand-alone instance on which you have enabled AlwaysOn Availability Groups. Each availability replica hosts a copy of the availability databases in the availability group. You don’t need any shared storage components if you are not using a SQL Server failover cluster instance.

The following diagram shows a possible configuration where the customer runs his workload in its datacenter in St. Gallen. He configures a local HA in form of a synchronous Availability Group replica in the same datacenter. The automatic failover will happen between these two replicas. There is another synchronous replica configured for Sion, where all the reports are generated real time (more on this later). The disaster recovery site in Geneva will be served by an asynchronous Replica due to network latency.

awon1

All this is configured on a single Windows Server Failover Cluster in multiple subnets. No VLANs are necessary, there is no SAN infrastructure in place. In this configuration the customer achieves a HA/DR configuration including read-only scale-out of his database-application with one single feature delivered, configured and managed by SQL Server.

Your application connects to a virtual network name and just have to make sure that it reconnects after a failover.

As stated above it’s also possible to configure AlwaysOn Availability Groups on top of a “classical” SQL Server Failover Cluster Instance. This might be necessary if the application runs distributed transactions on several databases within an instance, system databases need to be highly available or in a migration scenario. Be aware that in this case the automatic failover is managed by the SQL Server Failover Cluster instance and cannot be configured to another AlwaysOn Replica.

Sounds like a quite golden solution, but where doesn’t it fit?

Ow yes indeed, I really like the new feature and it opens lots of possibilities. But for example in all cases where you aren’t able to build a Windows Server Failover Cluster, because you don’t have an Active Directory for example, you cannot use the new feature. So Database Mirroring over two networking zones (possibly with encryption and authentication with certificates) is the solution to go in this case.

If you like to plan delayed restores for mitigating data errors you would still use LogShipping, or for example configure log-shipping on top of an Availability Group to an additional server.

In the above scenario you have 4 times data, aren’t you aware of the data explosion everywhere?

We certainly are! And please keep in mind that with a solution like that, you do not need to mirror on a storage level anymore, where for example tempdb is mirrored also and all changes on the data/log/tempdb files are transferred via the network connection to the other datacenter… in this case it’s done in a compressed way and only what needs to be transferred via the net, goes over… so no tempdb traffic, no changes on the datafile, just the “logstream” which gets applied on the secondary.

And there is an additional benefit with Availability Groups: When you create an availability group, you can configure read-only access to the availability replica that is running under the secondary role. Read-only access to the databases in the secondary replica allows you to offload your read-only workloads from your primary replica, and optimizes resources on your primary replica for your mission critical workloads. You can actively use the servers with the replicas, that’s great. You can also offload Integrity Checks or Backups!

The data in the secondary replicas is near real time. In many circumstances, data latency between the primary and secondary databases should be within seconds.

I’ll plan a separate blog post on this entire topic.

How can you test all this out?

Go to https://www.microsoft.com/denali and download CTP 3 and please file your suggestions and findings on https://connect.microsoft.com! We do not just read your feedback, we actively improve SQL Server with it…

On https://msdn.microsoft.com/en-us/library/ff878487(v=SQL.110).aspx you find every prerequisites for setting up an AlwaysOn Availability Group.

I list here the basic steps:

1. Setup Windows Server Failover Cluster Nodes (Windows Enterprise Edition, R2 and SP1 recommended to have necessary hotfixes installed)

2. Build a Windows Server Failover Cluster, Configure Quorum: Node Majority for odd number of nodes, Node Majority + File Share Witness with even number of nodes.
Please also read following KB article: https://support.microsoft.com/kb/2494036/en-us it may make sense to adjust votes for nodes in a cluster, especially for an asynchronous disaster recovery replica. Probably you don’t want to get an automatic failover in the production site just because the DR site had an issue

3. Install SQL Server Standalone Instances on every Node (just as if it would be a standalone server)
I recommend to configure and setup them according to your defined corporate standards, so every node looks the same. This makes it easier to manage and to setup Availability Groups.
SQL Server Policies and Multi Server Management help to achieve that every configuration item on every node is the same J

4. Enable AlwaysOn Capabilities in the SQL Server Configuration Manager on each node (can be done with PowerShell: https://msdn.microsoft.com/en-us/library/ff878259(v=SQL.110).aspx). An instance restart is necessary

5. By default the CTP3 has some limitations for the number of replicas and synchronous replication capabilities. To enable the full feature set as listed above you need to configure Traceflag T9532 on the new startup tab in the SQL Server Configuration Manager. A restart of the instances is necessary again.

Now the instances are ready to host AlwaysOn Availability Groups…

I really recommend you to read the Books Online topics around AlwaysOn Availability groups and start here: https://msdn.microsoft.com/en-us/library/gg509118(v=SQL.110).aspx. It is already great documented and it helps you to understand the wizards and scripts.

SQL Server Denali – Cloud on your terms

This blog post is just the beginning of a series where I will dig deeper in to configuration and operation of AlwaysOn and all its benefits. To finish this post I’d like to have a look on this feature from a cloud perspective. Couldn’t AlwaysOn Availability Groups be a perfect platform for your SQL Server Private Cloud Service? You setup multiple (up to 16) standalone SQL Servers, combine them to a Windows Server Failover Cluster and create databases with scripts where automatically an Availability Group gets configured, a virtual network name will be created for every application and the databases are highly available by default including disaster recovery if needed. Depending on the load you can move databases and the workload of groups of databases from server to server to balance your environment. Resource Governor would help to guarantee or limit resources depending on your SLAs and also help to invoice your internal customers based on usage… Sounds like cloud… stay tuned.