Rather than write endless amounts here I’m just going to describe some useful pointers that you can reference. The first reference is to failover clusters, I mention this one first as it dives head on into that word ‘Cluster’. This word is a pain as it means different things to different people and if its not properly qualified it just causes confusion. Unfortunately in Microsoft we often just use the word ‘clustering’ – when we do we usually mean failover clustering.
However, in the Reporting Services documentation (enterprise deployments), I found my first instance of Microsoft referring to Network Load Balanced (NLB) Clusters with the word cluster, and boy has it caused confusion – Reporting Services is not failover cluster aware!
So apart from ‘failover’ and ‘NLB’ clusters there are also ‘scale out’ clusters, sometimes called (by Microsoft) ‘federated servers’. Oracle also has their Real Application Clusters (RAC) which can provide both ‘scale out’ and ‘failover’ capabilities.
Published: June 18, 2002 | Updated: October 14, 2004
Abstract: Users demand not only performance and scalability, but availability of the applications they use. Achieving high availability requires a whole solution, part of which includes the SQL Server 2000 Server failover clustering. This white paper will explain what failover clustering is: how it works, considerations to take into account when designing your solution, how to implement and administer your solution, and how to troubleshoot a failover cluster. Also included are some useful worksheets and checklists to assist you during the installation, as well as links to additional information where necessary
The next reference is the most comprehensive discussion on High Availability (HA) I know – unfortunately it’s a little out of date but it is still valuable. It also deals with stretch clusters (sometimes referred to a geo clusters) but I consider this to fall more into the camp of Disaster Recovery (DR) – I’ll probably blog on DR another day.
This reference deals with both planning and deploying. Helpfully it discusses what high availability really means, because this can lead to mine field of arguments like ‘do you really need 5 9s? I find it useful to show customers as it put a number of technologies, like log shipping and NLB clusters into one document.
Updated: November 19, 2002
The Patterns & Practices team has decided to archive this content to streamline content offerings and keep focused on the newest, most relevant content. However, we will continue to make this content available because it is still of interest to some users. We offer this content as-is, without warranty that it is still technically accurate as some of the material may be outdated. Note that the content may contain URLs that were valid when originally published, but now link to sites or pages that no longer exist.
The following reference is actually a chapter pulled out of the SQL Server High Availability Series referenced above. I highlight it because it deals with backup and recovery strategies going all the way through to the two most advanced topics of ‘Split-mirror’ and ‘Copy-on-write’ snapshots.
Updated: November 14, 2002
Despite the best-laid plans, you cannot prevent every disaster that might befall a data center. You must therefore ensure that you can recover the data center from backups after a catastrophic failure, such as a user error or a hardware-induced database corruption. This chapter discusses using database backups to ensure a complete recovery of the data center in the event of a disaster. It focuses on techniques to optimize the Microsoft SQL Server 2000 recovery process and reduce the total recovery time to meet data center availability goals. A basic understanding of SQL Server backup and recovery procedures is assumed. For more information about SQL Server backup and recovery procedures, see "Backing Up and Restoring Databases" on the MSDN Web site at http://msdn.microsoft.com/library/en-us/adminsql/ad_bkprst_9zcj.asp.
If you feel the content of this blog misses something out please let me know.
I'm trying to find some detail on Patch management/deployment on a 4 SQL node cluster.
Can you guide me to anything on this please?