VIRTUALBOY BLOG

Hyper-V Clustering Limits Increased


As hardware increases in scale, and new capabilities such as Dynamic Memory are introduced in Hyper-V R2 SP1, more and more customers are going to start to encroach on the supported limits of Hyper-V cluster nodes.  As of May 2010, those supported limits stood at 64 VMs per cluster node, across up to 15+1 nodes, giving a total of 960 VMs (15 active nodes x 64).  This contrasts considerably with the 384 VMs supported on a non-clustered host, yet it is still more than enough headroom for most customers.  Even so, at TechEd 2010 we announced an increase to the limits on cluster nodes.  The increase is pretty considerable too, helping customers to scale to much greater levels, especially on smaller clusters, assuming the underlying hardware has the resources!

So, in a nutshell, we now support 1000 VMs per cluster, provided you don’t exceed the 384 VMs per node limit, which will still be enforced. In tabular form:

Number of Nodes in Cluster             Max Number of VMs per Node   Max # VMs in Cluster
2 Nodes (1 active + 1 failover)        384                          384
3 Nodes (2 active + 1 failover)        384                          768
4 Nodes (3 active + 1 failover)        333                          1000
5 Nodes (4 active + 1 failover)        250                          1000
6 Nodes (5 active + 1 failover)        200                          1000
7 Nodes (6 active + 1 failover)        166                          1000
8 Nodes (7 active + 1 failover)        142                          1000
9 Nodes (8 active + 1 failover)        125                          1000
10 Nodes (9 active + 1 failover)       111                          1000
11 Nodes (10 active + 1 failover)      100                          1000
12 Nodes (11 active + 1 failover)      90                           1000
13 Nodes (12 active + 1 failover)      83                           1000
14 Nodes (13 active + 1 failover)      76                           1000
15 Nodes (14 active + 1 failover)      71                           1000
16 Nodes (15 active + 1 failover)      66                           1000
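
For anyone who wants to work these figures out for a different number of reserved nodes, here is a rough Python sketch of the arithmetic. It is my own illustration rather than anything official: the function names and the reserved_nodes parameter are made up for the example, and rounding down to a whole VM is an assumption inferred from the values in the table.

```python
# Supported limits quoted in this post.
MAX_VMS_PER_CLUSTER = 1000
MAX_VMS_PER_NODE = 384
MAX_NODES_PER_CLUSTER = 16


def active_nodes(total_nodes, reserved_nodes=1):
    """Return the number of nodes left to run VMs after reserving failover nodes."""
    if not 1 <= total_nodes <= MAX_NODES_PER_CLUSTER:
        raise ValueError("a cluster has between 1 and 16 nodes")
    active = total_nodes - reserved_nodes
    if reserved_nodes < 0 or active < 1:
        raise ValueError("at least one node must remain active")
    return active


def max_vms_per_node(total_nodes, reserved_nodes=1):
    """Per-node ceiling: the cluster limit spread over the active nodes
    (rounded down, an assumption), capped by the per-node limit."""
    active = active_nodes(total_nodes, reserved_nodes)
    return min(MAX_VMS_PER_NODE, MAX_VMS_PER_CLUSTER // active)


def max_vms_in_cluster(total_nodes, reserved_nodes=1):
    """Cluster ceiling: active nodes at the per-node limit, capped by the cluster limit."""
    active = active_nodes(total_nodes, reserved_nodes)
    return min(MAX_VMS_PER_CLUSTER, active * MAX_VMS_PER_NODE)


if __name__ == "__main__":
    # Print one row per cluster size, using the N+1 layout from the table.
    for nodes in range(2, MAX_NODES_PER_CLUSTER + 1):
        print(nodes, max_vms_per_node(nodes), max_vms_in_cluster(nodes))
```

Passing reserved_nodes=0 gives the figures for clusters that don't hold a node back for failover, and a larger value covers those of you who reserve more than one.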

And here is the corresponding guidance from TechNet:

Component: Nodes per cluster
Maximum: 16
Notes: Consider the number of nodes you want to reserve for failover, as well as maintenance tasks such as applying updates. We recommend that you plan for enough resources to allow for 1 node to be reserved for failover, which means it remains idle until another node is failed over to it. (This is sometimes referred to as a passive node.) You can increase this number if you want to reserve additional nodes. There is no recommended ratio or multiplier of reserved nodes to active nodes; the only specific requirement is that the total number of nodes in a cluster cannot exceed the maximum of 16.

Component: Running virtual machines per cluster and per node
Maximum: 1,000 per cluster, with a maximum of 384 on any one node
Notes: Several factors can affect the real number of virtual machines that can be run at the same time on one node, such as:

· Amount of physical memory being used by each virtual machine.

· Networking and storage bandwidth.

· Number of disk spindles, which affects disk I/O performance.

Obviously many of you will look at that and say “We don’t leave 1 node free for ‘failover’”, whereas some of you will always do this, to ensure there are enough resources for failing over VMs in the event of an issue.  Now, I’m not going to say that you absolutely have to have a +1 node, but it is best practice nonetheless and something that should be considered in mission-critical deployments.  So, looking at the table, even on a 4 node cluster (3+1), you can hit the big 1000, which shows huge scalability and consolidation.  If you went from 1000 servers down to 4, that would be a saving of over 99% (assuming my aging maths is correct: (1000 - 4) / 1000 = 99.6%).  I’m going to say something now, and you should listen carefully.

Just because you can, doesn’t mean you should.

If you’re going to put that many eggs in so few baskets, you’re going to have to ensure that the underlying infrastructure is rock solid and extremely well capacity-planned and architected.  From networking requirements (a LOT of NICs would be needed in those hosts I imagine!) through to storage (how much I/O!?), and memory (Dynamic Memory will help!) through to CPU (8-12 cores will help!), every little decision could be amplified up to 333 times, so you have to nail it with detailed and thorough planning and comprehensive testing.

Perhaps an area where you’re more likely to hit this limit is when virtualising desktops, rather than servers.  In most organisations, the number of desktops typically outweighs the number of servers, so hitting the previous limits was much more achievable.  The increase gives any organisation that was creeping closer to those limits a bit of breathing room.


