Drive up networking performance for your most demanding workloads with Virtual RSS

This post is part of the nine-part “What’s New in Windows Server & System Center 2012 R2” series featured on Brad Anderson’s In the Cloud blog.  Today’s blog post covers Virtual RSS and how it applies to the larger topic of “Transform the Datacenter.”  To see the other technologies discussed in the series, read today’s companion post: “What’s New in 2012 R2: IaaS Innovations.”

As described in Transforming your datacenter – networking, Cloud Scale Performance and Diagnosability is a key capability expected of today’s networks. Customers demand high performance, with the ability to fully utilize the capacity available. In this blog post, we describe one such enhancement that drives up performance for networking-intensive virtualized workloads: Virtual RSS (vRSS).

Virtual Receive Side Scaling (vRSS)

As we began to plan Windows Server 2012 R2, we heard from our customers that they were unable to virtualize networking-intensive workloads. As we looked into it, we realized that although these VMs had multiple virtual processors (VPs), only one was being used for network traffic, and that VP was being completely consumed by network processing on high-speed networks. This meant that VMs with multiple VPs and heavy networking workloads, such as file servers, were effectively limited to a single VP to keep up with incoming data indications, even though they had more VPs allocated. Clearly, this does not scale to cloud needs.

Prior to 10 gigabit networking, one modern processor was usually more than enough to handle the networking workload of a VM. With the introduction of 10 Gb/s NICs, life became more complicated: the amount of data being sent to and received from a VM exceeded what a single processor could effectively handle. Our performance investigations found that because all network traffic was processed on a single VP, a single VM was limited to (on average) 5 Gbps, far below the full potential of the hardware installed in the system. The image below is a screenshot of Task Manager from within a VM. In this figure, VP3 is clearly being fully utilized and cannot support any additional traffic processing, even though the VM has 8 VPs allocated.

[Figure: Task Manager inside the VM, showing VP3 at full utilization while the other seven VPs sit mostly idle]
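
If you prefer to confirm this from PowerShell rather than Task Manager, per-VP utilization can be sampled inside the VM with Get-Counter; here is a minimal sketch (counter instance names may vary by system and locale):

# Sample each VP's utilization five times, one second apart; one pegged
# core alongside mostly idle siblings suggests receive processing is
# pinned to a single VP.
PS C:\> Get-Counter '\Processor(*)\% Processor Time' -SampleInterval 1 -MaxSamples 5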

Fortunately, this problem was not new. Prior to this release, we had encountered a similar situation with the introduction of multi-core machines for physical workloads. That experience produced Receive Side Scaling (RSS). RSS spreads traffic from the network interface card (NIC), based on TCP flows, across multiple processors so that flows can be processed simultaneously. I will not go into the details of RSS here, but inquiring minds can read more at this link, Receive Side Scaling. RSS enabled physical workloads to optimally utilize the bandwidth and cores available.
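
For reference, this is roughly how RSS is inspected and tuned on a physical machine today; “Ethernet” is a placeholder adapter name and the processor values are illustrative:

# Show whether RSS is enabled and which processors the NIC may spread across
PS C:\> Get-NetAdapterRss -Name "Ethernet"

# Optionally constrain spreading to a specific range of cores
PS C:\> Set-NetAdapterRss -Name "Ethernet" -BaseProcessorNumber 2 -MaxProcessors 4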

Similar to how RSS distributes networking traffic to multiple cores in physical machines, vRSS spreads networking traffic to multiple VPs in each VM by enabling RSS inside the VM. With vRSS enabled, a VM is able to process traffic on multiple VPs simultaneously and increase the amount of throughput it is able to handle.

vRSS is managed in the VM the same way RSS is managed on a physical machine. In the VM, open a PowerShell instance with administrator rights, then run the following cmdlet, substituting your network connection in the -Name field (or simply use “*” to enable it across all adapters).

PS C:\> Enable-NetAdapterRss -Name "Ethernet"
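
To verify the change took effect, you can query the adapter’s RSS state from the same elevated session; in the output, Enabled should now read True:

PS C:\> Get-NetAdapterRss -Name "Ethernet"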

Rerunning the same test as shown above now gives much improved results. Again, Task Manager from within the VM is shown. Processing is now distributed across all the VPs, and the VM is handling 9.8 Gbps of network traffic, double the previous throughput and effectively line rate on our 10G NIC! The best part about this new feature is that it didn’t require anyone to install or replace any hardware; it was all done by maximizing the use of existing resources in the server.

[Figure: Task Manager inside the VM after enabling vRSS, showing network processing spread across all 8 VPs at 9.8 Gbps]
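
If you want to measure the receive rate yourself rather than read it from Task Manager, one simple approach inside the VM is the built-in network interface counters (instance names vary by adapter):

# Average bytes received per second over five one-second samples
PS C:\> Get-Counter '\Network Interface(*)\Bytes Received/sec' -SampleInterval 1 -MaxSamples 5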

You’re probably thinking to yourself, “What’s the catch?” There is one, and it is the reason vRSS is not enabled by default on any VM. The extra calculations required to spread the traffic lead to higher CPU utilization on the host. This means that small VMs with minimal or average network traffic should not enable this feature; it is meant for VMs that process heavy network traffic, such as file servers or gateways.
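
For such VMs, the feature can be turned off the same way it was turned on; for example:

PS C:\> Disable-NetAdapterRss -Name "Ethernet"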

In summary, vRSS adds to the list of high-performance networking features introduced in Windows Server 2012. By driving up performance on existing high-speed NICs, service providers can see greater ROI and satisfy the needs of the most demanding workloads.

Give it a try, and let us know what you think!

Gabriel Silva, Program Manager, Windows Core Networking Team

To see all of the posts in this series, check out the What’s New in Windows Server & System Center 2012 R2 archive.

Comments
  • Hi, this is interesting stuff. I'm only just playing catch-up on features in SP1, so I'm probably missing something, but what is the difference between vRSS and VMDQ? From what I understand, VMDQ is basically RSS for VMs, but that seems to be true for vRSS too. Can you clarify or point me in the right direction? Thanks!

  • @Ross Taylor, vRSS spreads the load in the VM and the host to multiple processors. vRSS was implemented in response to one of the drawbacks of VMQ, where only one core in the VM and host was used to process network traffic. I recently started a series of blog posts specifically to dive deep into the implementation of VMQ and address these kinds of questions. You can find it here: blogs.technet.com/.../vmq-deep-dive-1-of-3.aspx.

  • I find in Server 2012 that a vNIC on a host with 2 x 10G NICs teamed, used for Live Migration, is maxed out at around 3.5-5G. Do you think this is a related issue?

    Or to word this differently, are host vNICs also constrained in some way in 2012 that will be improved in R2?

  • I am still puzzled by this... your example clearly shows that traffic initiated from the VM will use multiple VPs. How about traffic initiated from outside? Will it be spread across multiple cores, or will it be directed through the single core assigned to the queue that this VM is assigned to?

  • Can I consider the parent partition a VM in a converged networking scenario where the physical NICs are in an LBFO team with a vSwitch on top and vNICs for management, cluster, and live migration? Can I run the command Enable-NetAdapterRss -Name "LiveMigration" there?

  • Hi
    I'm interested. However, I have a few questions.

    Which operating systems inside a virtual machine support vRSS?

    Is only Windows Server 2012 R2 supported?

    Can I run Enable-NetAdapterRss in a virtual machine that has Windows Server 2012 installed when the Hyper-V host is Windows Server 2012 R2?

  • @Gary Hay - vRSS is a new feature in 2012 R2, so there is no spreading for any vNIC prior to 2012 R2; VMQ is the only technology for pre-2012 R2 operating systems. However, the host vNIC does not support vRSS even in 2012 R2, which is why you are still limited to one core; vRSS is limited to VMs only.

  • @SrdjanM - I don't quite understand your question. In this example, traffic is being initiated from a separate physical box acting as a sender to a VM in another box, which is our receiver.

  • @Don - Along the same lines as my answer to Gary above: since Live Migration and cluster traffic run through the host vNIC, they do not get vRSS in 2012 R2. We understand this limits the host, and it is something we are actively looking into.