Hi, I am Hilton Lange, SDE on the Virtual Machine Manager team. Today I want to share how the cluster reserve calculations were entirely rewritten in VMM 2012. This rewrite dramatically reduces the overly conservative results of the slot-based approach used in VMM 2008, 2008 R2, and 2008 R2 SP1. Here is a broad summary:
Cluster reserve value of 1.
Consider each host for potential failure:
· Call the largest HA VM on that host “VLargest” and record its memory as “LargestM”.
· Add up the memory of the other running HA VMs on the host. Call that amount “OtherM”.
· Now consider each other host in the cluster, keeping a running total “TotalExtraCapacity”:
o Calculate how much extra capacity that host has before “VLargest” can no longer be placed there:
o Extra capacity = Host total memory – Host memory reserve – Memory used by the VMs already on that host – LargestM
o If this extra capacity is non-negative, add it to “TotalExtraCapacity”.
· If “OtherM” is greater than “TotalExtraCapacity”, the cluster will be shown as overcommitted.
Repeat this test for each host. If no host fails the test in the final step, your cluster will be shown as healthy!
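The steps above can be sketched in Python. This is only an illustration of the described check, not the shipping VMM code; the field names (`total_mem`, `host_reserve`, `vm_mems`) are hypothetical, and all memory values are assumed to be in the same unit.

```python
def is_overcommitted(hosts):
    """Cluster-reserve-1 overcommit check as described above.

    Each host is a dict with hypothetical fields:
      total_mem    - total host memory
      host_reserve - host OS memory reserve
      vm_mems      - memory sizes of the HA VMs on the host
    """
    for failed in hosts:                     # consider each host for failure
        if not failed["vm_mems"]:
            continue                         # nothing to fail over
        largest_m = max(failed["vm_mems"])               # "LargestM"
        other_m = sum(failed["vm_mems"]) - largest_m     # "OtherM"

        total_extra_capacity = 0
        for host in hosts:                   # every *other* host
            if host is failed:
                continue
            extra = (host["total_mem"] - host["host_reserve"]
                     - sum(host["vm_mems"]) - largest_m)
            if extra >= 0:                   # VLargest can still be placed here
                total_extra_capacity += extra

        if other_m > total_extra_capacity:
            return True                      # this failure cannot be absorbed
    return False                             # all failures survivable: healthy
```

For example, with two 16 GB hosts, each reserving 1 GB for the host OS: a 4+4 GB host plus a 4 GB host passes the test, while an 8+7 GB host plus a 4 GB host fails it, because after placing the 8 GB VM the surviving host has only 3 GB of extra capacity left for the remaining 7 GB.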
Cluster reserve values greater than 1.
The same algorithm as above applies, except that you need to consider each set of R hosts that could fail simultaneously. For each such set, take the largest VM on the set as “VLargest” and add up all the other VMs on the set as “OtherM”. Obviously the algorithm becomes too cumbersome to check by hand as soon as R exceeds 1 on a reasonably large cluster. Our algorithm will continue to work for all reasonable cluster reserve values, and falls back to a suitable approximation in the unlikely scenario that you have a higher reserve value.
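The generalization can be sketched the same way, iterating over every set of R hosts. Again this is an illustration with hypothetical field names, not the shipping code, and it does not include the approximation used for very large reserve values:

```python
from itertools import combinations

def is_overcommitted_r(hosts, reserve):
    """Overcommit check for a cluster reserve of `reserve` hosts.

    Hosts use the same hypothetical fields as before:
    total_mem, host_reserve, vm_mems.
    """
    for failed_set in combinations(hosts, reserve):
        # Pool the HA VMs of every host in the failed set.
        vms = [m for h in failed_set for m in h["vm_mems"]]
        if not vms:
            continue
        largest_m = max(vms)               # largest VM on the set
        other_m = sum(vms) - largest_m     # "OtherM" over the whole set

        total_extra_capacity = 0
        for host in hosts:                 # surviving hosts only
            if any(host is f for f in failed_set):
                continue
            extra = (host["total_mem"] - host["host_reserve"]
                     - sum(host["vm_mems"]) - largest_m)
            if extra >= 0:
                total_extra_capacity += extra

        if other_m > total_extra_capacity:
            return True
    return False
```

The number of failure sets grows combinatorially with R, which is why checking this by hand quickly becomes impractical.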
Stopped VMs are considered as running for this algorithm. The rationale is that self-service users (SSUs) may start up their VMs, and we don’t want that action to unintentionally overcommit the cluster by bypassing the normal placement checks.
Dynamic memory is a thorny issue. Consider the following worst case: because of some external load factor, all hosted VMs suddenly expand to their maximum configured memory, and then you experience a node failure. Because the overcommit calculation is designed to guarantee that HA VMs will be able to fail over and start, the only way to give this guarantee is to count each dynamic memory VM at its maximum configured size.
However, this entirely negates the additional consolidation and flexibility that dynamic memory provides. A hoster or fabric owner may as well simply allocate all dynamic memory VMs as static with their maximum memory.
Because of this, dynamic memory VMs are counted at their current memory size. This means that the overcommit status gives you information and a guarantee about what would happen if you experienced a node failure right now, but the status may change if a group of dynamic memory VMs grows in size.
To measure accuracy, 1 million random cluster configurations close to the overcommit boundary were generated; 130,184 of those configurations were actually overcommitted.
[Chart: false positive rate of the old vs. new method.]
Real-world data might differ from the random cluster configurations, but we’re expecting to see a tenfold decrease in clusters marked as overcommitted when they’re actually not.
Hope this helps you!
Can this be calculated by a PowerShell script? It would be good to have such a script to produce a capacity report at a particular time.
I would be glad if you have such a script and could share it with us.
Currently the overcommit calculation does exactly as described above, and shows a cluster to either be overcommitted, or not overcommitted. There is no measure of how close the cluster might be to capacity. Sometimes a single very large VM can cause a cluster to become overcommitted, or removal of a host, or a sudden increase in dynamic memory across a number of VMs.
That said, there is some value in showing some metric of how near to overcommitment the cluster is. I'll certainly investigate if this is possible, but I'm afraid there is nothing that does this at this time.
Thank you for the reply. I will continue following the blog if new information arrives.
I don't entirely understand where I'm getting the numbers from in your equations.
Please expand: "...until “VLargest” can no longer be placed there"?
"Now consider each other host in the cluster, keep a running total “TotalExtraCapacity”"......."add this amount to “TotalExtraCapacity”"
What is “TotalExtraCapacity”? Is it the extra capacity of all hosts added together?
What is VM used memory?
I am trying to understand why my cluster goes overcommitted when there is more than enough RAM in either one of my two hosts to manage all the VMs I have in the cluster.
My cluster reserve is 1, as I can only handle one host failure.
I have just been migrating my 2008 R2 cluster (6 nodes, mostly 64 GB blades, about 30 VMs) to Server 2012, comprising a mixture of VM memory sizes (2 GB through to 20 GB). Pretty much each node would stop accepting VMs at about 90% of capacity, regardless of the composition, whereas 2008 R2 took up to 100%. All of my VMs have static memory.
Hope this helps (or is vaguely relevant!)
I have a similar question as Christopher Mackintosh below. We have an existing 3-node cluster, and recently reached overcommitted status. I "stole" some memory from VMs that didn't need it, and cleared that issue. We're building a new cluster in a new datacenter, and changing a few things up, so I'm trying to run the numbers in order to understand how long until we reach overcommitted status down there. My cluster reserve is 1, and I will have 4 hosts in the cluster with 256 GB of memory each. I'm estimating 743 GB of allocated memory in VMs, with the largest being 32 GB. I plan on turning on dynamic optimization, so the number of VMs on each host will probably change every so often. Can someone assist me with the proper Hyper-V 2012 calculation for memory?
PLEASE EXPAND on "keep a running total “TotalExtraCapacity”" — is it an amount of memory or a number of hosts?
Hi, it doesn't look like the calculation is actually doing the right thing with VMs that have dynamic memory. It looks like it is using the maximum memory and not the memory actually used. Could this be the case?