Today’s blog concludes the discussion of Hyper-V & Scale Up Virtual machines.

Jeff Woolsey, a Principal Program Manager on the Windows Server team, wrote this blog.

 

Virtualization Nation,

In the last blog, we discussed how Windows Server “8” introduces NUMA for virtual machines and how Hyper-V automatically does the right thing when creating a virtual machine. Let’s take this a step further and discuss what improvements have been made from the Windows 8 Developer Preview to the Windows Server “8” Beta (and Client Hyper-V in Windows 8 Consumer Preview).

Learning from the Developer Preview
Here’s what the GUI looked like in the Windows Server “8” Developer Preview.

 

Figure 2: Windows 8 Developer Preview Hyper-V VM NUMA Settings


During development, we found you were naturally interested in these NUMA settings and started reconfiguring them. (As expected …) However, once you were done experimenting, it wasn’t clear how to get back to the optimal settings. What was needed was something like this:

 

Figure 3: Reset Button


And that’s exactly what we did. If you take a look at Hyper-V in Windows Server “8” Beta, the configuration has been redesigned. In addition, you’ll see there’s a new button (highlighted in green) labeled Use Hardware Topology. When you click this button, Hyper-V resets the virtual NUMA topology to the topology of the physical hardware.

 

Figure 4: Windows 8 Beta Hyper-V VM NUMA Settings

So, when in doubt, click the Use Hardware Topology button to get your optimal settings. Finally, I should mention that the virtual machine must be turned off before you can make any NUMA configuration changes or click the Use Hardware Topology button. I don’t know of any operating system that can handle having its NUMA configuration changed on the fly.
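
To make the behavior concrete, here is a minimal Python sketch. It is not a Hyper-V API: the class names, field names, and numbers are hypothetical and chosen only for illustration. It models just the two rules described above, namely that NUMA settings can change only while the virtual machine is off, and that the reset copies the physical host topology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NumaTopology:
    # Hypothetical fields, used only to illustrate a topology description.
    processors_per_node: int
    memory_mb_per_node: int
    nodes_per_socket: int


@dataclass
class VirtualMachine:
    name: str
    running: bool
    virtual_numa: NumaTopology

    def set_numa(self, topology: NumaTopology) -> None:
        # NUMA settings can only change while the VM is turned off.
        if self.running:
            raise RuntimeError(f"{self.name} must be turned off first")
        self.virtual_numa = topology

    def use_hardware_topology(self, host: NumaTopology) -> None:
        # Model of the reset button: mirror the physical host topology.
        self.set_numa(host)


# Made-up host topology and a VM left with experimental values.
host = NumaTopology(processors_per_node=8, memory_mb_per_node=32768, nodes_per_socket=1)
vm = VirtualMachine("vm1", running=False, virtual_numa=NumaTopology(2, 1024, 4))
vm.use_hardware_topology(host)
print(vm.virtual_numa == host)  # True
```

Calling set_numa on a running virtual machine raises an error in this sketch, which mirrors the requirement to turn the virtual machine off first.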

One additional benefit of Hyper-V virtual NUMA is that it helps future-proof your investment. Whether you’re deploying existing solutions on this version of Hyper-V or architecting new solutions for the future, NUMA is prevalent today and its use is only increasing. In fact, here’s another great example with Windows Server “8”: IIS 8.0.

Windows Server “8,” Hyper-V, IIS 8.0 & NUMA
For the first time, IIS is NUMA-aware. From http://learn.iis.net/page.aspx/1095/iis-80-multicore-scaling-on-numa-hardware/:

IIS 8.0 addresses this problem by intelligently distributing and affinitizing its processes on Non-Uniform-Memory-Access (NUMA) hardware.

Internet Information Services (IIS) on Windows Server 8 is NUMA-aware and provides the optimal configuration for IT administrators. The following section describes the different configuration options to achieve the best performance with IIS 8.0 on NUMA hardware.
IIS supports the following two ways of partitioning the workload:

1. Run multiple worker processes in one application pool (i.e. web garden).
If you are using this mode, by default, the application pool is configured to run one worker process. For maximum performance, you should consider running the same number of worker processes as there are NUMA nodes, so that there is 1:1 affinity between the worker processes and NUMA nodes. This can be done by setting "Maximum Worker Processes" AppPool setting to 0. In this setting, IIS determines how many NUMA nodes are available on the hardware and starts the same number of worker processes.

2. Run multiple application pools in a single workload/site.
In this configuration, the workload/site is divided into multiple application pools. For example, the site may contain several applications that are configured to run in separate application pools. Effectively, this configuration results in running multiple IIS worker processes for the workload/site and IIS intelligently distributes and affinitizes the processes for maximum performance.
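
The "Maximum Worker Processes" rule from the first option can be sketched in a few lines of Python. This is an illustration of the rule as described, not IIS code; the function name and the node count are assumptions.

```python
def worker_process_count(max_worker_processes: int, numa_nodes: int) -> int:
    """Return how many worker processes an application pool would start."""
    if max_worker_processes < 0 or numa_nodes < 1:
        raise ValueError("invalid configuration")
    # A setting of 0 means "one worker process per NUMA node".
    if max_worker_processes == 0:
        return numa_nodes
    return max_worker_processes


# Hypothetical 4-node machine: the default (1) versus the NUMA setting (0).
for setting in (1, 0):
    print(setting, worker_process_count(setting, numa_nodes=4))
```

With the default of 1, a single worker process starts; with 0, the sketch returns one process per node, which gives the 1:1 mapping mentioned above.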

 

In addition, there are two different ways for IIS 8.0 to identify the most optimal NUMA node when the IIS worker process is about to start.

1. Most Available Memory (default)
The idea behind this approach is that the NUMA node with the most available memory is the one that is best suited to take on the additional IIS worker process that is about to start. IIS has the knowledge of the memory consumption by each NUMA node and uses this information to "load balance" the IIS worker processes.
2. Windows
IIS also has the option to let Windows OS make this decision. Windows OS uses round-robin.
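
The two selection strategies can be compared with a small Python sketch. The function names and the free-memory figures are hypothetical; this only illustrates the ideas of "most available memory" and round-robin, not how IIS or Windows implement them.

```python
from itertools import cycle
from typing import Dict, Iterable, Iterator


def most_available_memory(free_mb_by_node: Dict[int, int]) -> int:
    # Pick the node that currently reports the most free memory.
    return max(free_mb_by_node, key=lambda node: free_mb_by_node[node])


def round_robin(nodes: Iterable[int]) -> Iterator[int]:
    # Hand out nodes in a fixed rotating order, ignoring memory.
    return cycle(sorted(nodes))


free_mb = {0: 4096, 1: 12288, 2: 8192}  # made-up sample values
print(most_available_memory(free_mb))  # 1
rr = round_robin(free_mb)
print([next(rr) for _ in range(4)])  # [0, 1, 2, 0]
```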

Finally, there are two different ways to affinitize the threads from an IIS worker process to a NUMA node.

1. Soft Affinity (default)
With soft affinity, if other NUMA nodes have the cycles, the threads from an IIS worker process may get scheduled to a non-affinitized NUMA node. This approach helps to maximize all available resources on the system as a whole.
2. Hard Affinity
With hard affinity, regardless of what the load may be on other NUMA nodes on the system, all threads from an IIS worker process are affinitized to the chosen NUMA node that was selected using the design above.
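
The difference between the two affinity modes comes down to which nodes a worker's threads are allowed to run on. The following Python sketch is a simplified model under that assumption; the function and parameter names are hypothetical and do not correspond to any IIS setting or API.

```python
from typing import Iterable, Set


def schedulable_nodes(home_node: int, mode: str,
                      nodes_with_spare_cycles: Iterable[int]) -> Set[int]:
    """Return the NUMA nodes a worker's threads may run on."""
    if mode == "hard":
        # Hard affinity: threads stay on the chosen node no matter what.
        return {home_node}
    if mode == "soft":
        # Soft affinity: other nodes are allowed when they have spare cycles.
        return {home_node} | set(nodes_with_spare_cycles)
    raise ValueError(f"unknown affinity mode: {mode}")


print(sorted(schedulable_nodes(0, "soft", [2, 3])))  # [0, 2, 3]
print(sorted(schedulable_nodes(0, "hard", [2, 3])))  # [0]
```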

Performance Monitoring & Virtual NUMA
You may be wondering, “How can I verify that the virtual machines I’m running are running using local CPU and memory resources as opposed to remote?” Hyper-V has that too. Take a look in Perfmon and you’ll notice two new counters:

1.  Highlighted in green: For virtual processors, Hyper-V Hypervisor Virtual Processor has a new Remote Run Time counter.
2.  Highlighted in red: Under Hyper-V VM VID Partition you’ll see the new Remote Physical Pages counter.
 

Figure 2: New Virtual NUMA Performance Counters


The LOWER the number (zero is ideal) the better. In this case, both numbers are zero (best case) meaning that all virtual processor and memory allocations are local.
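
If you collect these counter values in a script, the check is simply "is anything above zero?". Here is a minimal Python sketch of that check. The sample values are made up, and the dictionary keys reuse the counter names from this post rather than exact Perfmon counter paths.

```python
from typing import Dict


def remote_activity(samples: Dict[str, float]) -> Dict[str, float]:
    """Return only the counters that show remote (non-local) activity."""
    return {name: value for name, value in samples.items() if value > 0}


# Hypothetical readings: one all-local VM and one with some remote run time.
all_local = {"Remote Run Time": 0.0, "Remote Physical Pages": 0.0}
some_remote = {"Remote Run Time": 3.5, "Remote Physical Pages": 0.0}

print(remote_activity(all_local))    # {}
print(remote_activity(some_remote))  # {'Remote Run Time': 3.5}
```

An empty result corresponds to the best case described above, where all virtual processor and memory allocations are local.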

Summary
With Windows Server “8” we want to help you cloud-optimize your business and give you the ability to host a greater percentage of workloads on Hyper-V. In addition, we want to future-proof your investments, which is why this version of Hyper-V:
     •  Supports massive scale-up virtual machines
     •  Introduces virtual NUMA and optimally configures virtual machine topology automatically

Finally, since you may not have a large scale-up system close at hand, I thought I’d leave you with a few screenshots. Cheers, -Jeff

 

Figure 3: Windows Server 2008 R2 SP1 as a Guest with 32 Virtual Processors

 

 

Figure 4: Windows Server “8” Beta as a Guest with 32 Virtual Processors

 

 

Figure 5: CentOS 6.2 as a Guest with 32 Virtual Processors