Thanks for visiting our blog! I’m a development lead in the Windows Server Performance team and I led the performance effort on Hyper-V for Windows Server 2008 over the past three and a half years.
We’ve worked with the product team throughout the Hyper-V development cycle to deliver a competitive product and we’re excited about shipping Hyper-V RTM this year, with the Hyper-V Beta shipping in Windows Server 2008 this week!
Hyper-V uses a hypervisor-based architecture and leverages the Windows driver model for broad hardware support. The hypervisor partitions a server into containers of CPU and memory. As a micro-kernel, it provides mechanisms for inter-partition communication, upon which our new high-performance synthetic I/O architecture is built. The root partition owns the physical I/O devices and, through the virtualization stack, provides services such as I/O to the child partitions.
The virtualization stack implements emulated I/O devices such as an IDE controller and a DEC 21140A network adapter. However, it is expensive to virtualize such devices: sending a single I/O might require multiple trips between the virtualization stack and the child partition. Instead, Hyper-V exposes synthetic I/O devices that are specially designed for VM environments. These devices are attached to VMBus, a plug-and-play capable bus that uses shared memory for efficient inter-partition communication. Windows guests detect the devices on VMBus and load the appropriate drivers.
Synthetic I/O in Hyper-V uses a client-server architecture with Virtualization Service Providers (VSPs) in the root and Virtualization Service Clients (VSCs) in the child. This architecture significantly reduces the cost of sending an I/O. Virtual Server customers should observe a major reduction in CPU usage in I/O-intensive loads when they migrate their VMs to Hyper-V.
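To make the cost difference concrete, here is a toy model in Python. It is only a sketch under stated assumptions, not a description of how VMBus is actually implemented: the class names, the figure of six register accesses per I/O, and the "signal only when the queue goes from empty to non-empty" policy are all made up for illustration. It counts how many partition transitions each approach would need for a batch of I/Os.

```python
from collections import deque


class EmulatedDevice:
    """Toy emulated device: each register access traps to the virtualization stack."""

    def __init__(self, register_accesses_per_io):
        # Assumed number of trapped register accesses needed to issue one I/O.
        self.register_accesses_per_io = register_accesses_per_io
        self.transitions = 0

    def send(self, payload):
        # Every trapped access is counted as one trip out of the child partition.
        self.transitions += self.register_accesses_per_io
        return payload


class SharedRing:
    """Toy shared-memory queue: the client posts requests, the provider drains them."""

    def __init__(self):
        self.ring = deque()
        self.signals = 0

    def post(self, payload):
        was_empty = not self.ring
        self.ring.append(payload)
        if was_empty:
            # Assumed policy: signal the provider only on empty -> non-empty.
            self.signals += 1

    def drain(self):
        items = list(self.ring)
        self.ring.clear()
        return items


def compare(n_ios, register_accesses_per_io=6):
    """Return (emulated transitions, synthetic signals, requests delivered)."""
    emulated = EmulatedDevice(register_accesses_per_io)
    ring = SharedRing()
    for i in range(n_ios):
        emulated.send(i)
        ring.post(i)
    delivered = ring.drain()
    return emulated.transitions, ring.signals, len(delivered)


if __name__ == "__main__":
    print(compare(100))
```

In this model the emulated path pays a fixed number of transitions per I/O, while the shared-memory path can batch many requests behind a single signal. Real numbers depend on the device and the workload.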
In addition, we developed operating system enlightenments for Windows Server 2008, which make the NT kernel and memory manager smarter in VM environments, again to reduce the cost of virtualization.
For this first blog post, I want to highlight one of the major performance features in Hyper-V: multi-processor virtual machines. Hyper-V supports 4P VMs for Windows Server 2008 guests and 2P VMs for Windows Server 2003 SP2 guests. For more intensive server workloads, you might consider virtualizing them in 2P or 4P VMs on Hyper-V. Of course, you should use multi-processor VMs only if the workload requires it since there is some cost to having additional processors.
However, operating system kernels and drivers use spin locks, which do not block: a waiter spins until it acquires the lock, on the assumption that the lock is held only for a short period. Virtualization breaks this assumption because virtual processors (VPs) are time-sliced. If a VP is preempted while holding a spin lock, other VPs may spin for a long time, wasting CPU cycles.
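The effect can be illustrated with a small discrete-time simulation. This is a simplified sketch, not a model of the Hyper-V scheduler: the function name, the tick granularity, and the two-VP setup are assumptions. VP0 takes the lock at tick 0 and needs a fixed amount of running time to release it, while VP1 spins the whole time.

```python
def spin_ticks_wasted(hold_ticks, preempted_at=None, preempt_ticks=0):
    """Count the ticks VP1 spends spinning while VP0 holds the lock.

    hold_ticks    -- running time VP0 needs inside the critical section
    preempted_at  -- wall-clock tick at which VP0 is descheduled (None = never)
    preempt_ticks -- how long VP0 stays descheduled
    """
    progress = 0  # running time VP0 has accumulated while holding the lock
    tick = 0
    wasted = 0
    while progress < hold_ticks:
        preempted = (
            preempted_at is not None
            and preempted_at <= tick < preempted_at + preempt_ticks
        )
        if not preempted:
            progress += 1  # VP0 only makes progress while it is scheduled
        wasted += 1        # VP1 spins on every tick until the lock is released
        tick += 1
    return wasted


if __name__ == "__main__":
    # Lock held for 5 ticks, holder never preempted.
    print(spin_ticks_wasted(5))
    # Same lock, but the holder is descheduled for 1000 ticks after 2 ticks.
    print(spin_ticks_wasted(5, preempted_at=2, preempt_ticks=1000))
```

Without preemption the waiter spins for only as long as the lock is held. With preemption, the whole time slice during which the holder is descheduled is added to the spin.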
We developed innovations in the hypervisor and Windows Server 2008 kernel to try to prevent long spin wait conditions and also to efficiently detect and handle them when they do occur. We also designed the hypervisor, including the scheduler and memory virtualization logic, to be lock-free on most critical paths to ensure good scalability on multi-processor systems.
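One general way to limit the damage from a long spin wait is to bound the spin and give up the processor once a threshold is exceeded. The sketch below illustrates that idea only. The post does not describe the actual mechanism, so the function name, the spin limit, and the fixed yield interval are assumptions for illustration. As before, VP0 holds the lock and is preempted. VP1 spins for at most spin_limit ticks, then yields for yield_ticks, during which the processor is free for other work.

```python
def spin_then_yield(hold_ticks, preempted_at, preempt_ticks, spin_limit, yield_ticks):
    """Return (ticks wasted spinning, ticks yielded to other work) for VP1."""
    progress = 0    # running time VP0 has accumulated while holding the lock
    tick = 0
    wasted = 0
    yielded = 0
    spin_run = 0    # consecutive ticks VP1 has spun since its last yield
    yield_left = 0  # remaining ticks of the current yield
    while progress < hold_ticks:
        if not (preempted_at <= tick < preempted_at + preempt_ticks):
            progress += 1
        if yield_left > 0:
            # VP1 has given up the processor, so this tick is not wasted.
            yield_left -= 1
            yielded += 1
        else:
            wasted += 1
            spin_run += 1
            if spin_run >= spin_limit:
                # Long spin detected: stop spinning and yield for a while.
                spin_run = 0
                yield_left = yield_ticks
        tick += 1
    return wasted, yielded


if __name__ == "__main__":
    print(spin_then_yield(hold_ticks=5, preempted_at=2, preempt_ticks=1000,
                          spin_limit=10, yield_ticks=90))
```

With these made-up parameters, most of the 1005 ticks the lock stays held are yielded rather than burned spinning. Choosing the threshold is a trade-off: too low and short, uncontended waits pay for an unnecessary yield; too high and preempted-holder cases still waste cycles.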
As a result, Windows Server 2008 as a 4P guest scales well compared to the physical 4P system. This is one example of Windows Server 2008 as a guest and Hyper-V together providing performance advantages. We plan to continue improving our scalability on multi-processor systems and multi-processor VMs in subsequent releases.
Thanks for reading this far! I would encourage you to try Hyper-V Beta in Windows Server 2008, which launched this week. And take a look at the Windows Server 2008 and Virtualization web site for more information.
I look forward to writing more on our work on Hyper-V performance. Please add our blog to your RSS feeds!
Senior Development Lead
Windows Server Performance Team
If the guest OS is the Windows 2003 R2 ENT edition, up to how many CPUs can Hyper-V "provide" to the VM?
Hi Christos, Hyper-V supports up to 2 virtual processors for a Windows 2003 R2 ENT guest OS. For a complete listing of the supported number of virtual processors for other guest OS, please see
I was wondering what happens when I have a single-threaded application that can only take advantage of a single processor, but I choose to run it on a VM that supports up to 4 virtual processors (i.e. Windows 2008 R2). Will my VM be scheduled with all four virtual CPUs even though it only needs one to execute?
I can't see anything on any of Microsoft's sites that explicitly clarifies this. However, V2.0 of the Hypervisor Functional Specification, in section 17 "Partition Save and Restore", seems to imply that this is the case in 17.1.8: "... Create new virtual processors. The count and IDs of the virtual processors should match those of the partition that was previously saved." That said, in other places individual processors can be relinquished when events such as spinlocks occur.
The behavior in a virtualized OS is similar to running natively: if an application is single-threaded, then it can only run on one processor at any given point in time, with the possibility of migrating to other processors based on other activities in the VM. Other activities include interrupts, DPCs, processes, and system threads.
With that in mind, if you are creating a VM primarily to run a single-threaded application, creating a VM with 2 VPs makes sense because another VP can be available for other system work. If this is a VM that has been saved and started again as a result of Live Migration or something similar, then the VM will come up with 4 VPs and the application will only use one.
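As a back-of-the-envelope illustration of the point above, the snippet below computes the highest CPU utilization a guest can report when it has fewer runnable threads than VPs. It is plain arithmetic, not a Hyper-V API; the function name is made up, and it ignores the background system work (interrupts, DPCs, system threads) mentioned in the answer. It also says nothing about how the hypervisor schedules idle VPs.

```python
def max_guest_cpu_utilization(runnable_threads, virtual_processors):
    """Upper bound on guest-wide CPU utilization, as a fraction between 0 and 1."""
    if virtual_processors < 1:
        raise ValueError("a VM needs at least one virtual processor")
    # A thread can occupy at most one VP at any given point in time.
    busy_vps = min(runnable_threads, virtual_processors)
    return busy_vps / virtual_processors


if __name__ == "__main__":
    for vps in (1, 2, 4):
        print(vps, max_guest_cpu_utilization(1, vps))
```

For a single CPU-bound thread, the ceiling is 100% of a 1 VP guest, 50% of a 2 VP guest, and 25% of a 4 VP guest, which is why a low overall utilization figure in a multi-VP guest does not necessarily mean the application has headroom.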
If I may ask a supplementary question: if I have a single-threaded CPU-bound application (i.e. a scientific workload) and that is the only work I wish to run on that VM then, as I understand the previous answer, creating a VM with 4 VPs will result in 4 logical processors being scheduled to the VM each and every time it is dispatched by Hyper-V. So even though my application can only use one VP at a time (dispatched by Windows across all 4 VPs) most of the time (Windows overheads and interrupts for this type of workload are small), Hyper-V will still schedule 4 VPs.
Does this mean that approximately 3 logical processors are effectively wasted, because no other VM can get access to these LPs while the application is running on this VM, and that I would be better off configuring the VM with 1 or 2 VPs?
If you are running a single-threaded application, then you are better off configuring the VM with 1 or 2 virtual processors (VPs).
More information about configuring the VM in terms of VP count and monitoring performance can be found on
The Tuning Guide covers things like how VPs are scheduled on the system's logical processors (LPs) and other topics you might find useful for your use case.