While reviewing CPU utilization in Task Manager on an Exchange 2010 server recently, I noticed that Processor 0 was at 100% CPU while all of the other processors were considerably lower. This type of behavior is typically a sign that the Receive Side Scaling (RSS) feature is not enabled on the server. RSS was first implemented back in Windows 2003 with the Scalable Networking Pack, and it allows incoming network traffic to be spread across multiple CPU cores. If RSS is not enabled, only *one* CPU is used to process incoming network traffic, which can cause a networking bottleneck on the server. Additional information on RSS can be found here.
Here is what it looks like in Task Manager on the Performance tab.
As you can see, the first processor is pegged at 100% CPU, which is indicative of RSS not being enabled. On new installations of Windows 2008 or later, this feature is generally enabled by default, but in this case it was disabled.
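One way to confirm the setting is to read the global TCP parameters with `netsh int tcp show global` and look for the Receive-Side Scaling State line. The following is a minimal Python sketch of that check. The helper names `rss_state` and `read_netsh_output` and the abbreviated sample text are illustrative, and the label wording is assumed to match an English-language Windows Server 2008 system.

```python
import re
import subprocess


def rss_state(netsh_output):
    """Return 'enabled', 'disabled', or None if the RSS line is not found."""
    match = re.search(
        r"Receive-Side Scaling State\s*:\s*(\w+)", netsh_output, re.IGNORECASE
    )
    return match.group(1).lower() if match else None


def read_netsh_output():
    """Run netsh on the local Windows machine and return its text output."""
    result = subprocess.run(
        ["netsh", "int", "tcp", "show", "global"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# Abbreviated, illustrative sample of the netsh output (not captured from
# the server discussed in this post).
SAMPLE = """TCP Global Parameters
----------------------------------------------
Receive-Side Scaling State          : disabled
Chimney Offload State               : automatic
"""

print(rss_state(SAMPLE))  # prints "disabled"
```

On a live server you would call `rss_state(read_netsh_output())` instead of using the sample. If the state comes back as disabled, `netsh int tcp set global rss=enabled` is the usual way to turn it back on at the operating system level; the network adapter's own driver settings also need RSS enabled.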
Prior to enabling RSS on any given machine, there are a few dependencies that are necessary for RSS to work properly and are listed below.
After enabling RSS, you can clearly see the difference in processor utilization on the server below: starting right around 3:00AM, the CPU utilization for Processor 0 is fairly close to that of the other processors.
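The before-and-after pattern described here can be turned into a simple check: is one processor pegged while the rest sit well below it? Here is a rough Python sketch of that idea. The function name `find_hot_cpu`, the thresholds, and the example numbers are all assumptions chosen for illustration, not measurements from this server.

```python
def find_hot_cpu(per_cpu_percent, pegged=90.0, gap=40.0):
    """Return the index of a CPU that is pegged while the others are not.

    per_cpu_percent: list of utilization percentages, one per processor.
    pegged: utilization at or above which a CPU counts as saturated.
    gap: minimum difference between the busiest CPU and the average
         of the remaining CPUs before the load is considered skewed.
    Returns None when the load looks reasonably balanced.
    """
    if len(per_cpu_percent) < 2:
        return None  # nothing to compare against

    busiest = max(range(len(per_cpu_percent)), key=lambda i: per_cpu_percent[i])
    others = [v for i, v in enumerate(per_cpu_percent) if i != busiest]
    others_avg = sum(others) / len(others)

    if per_cpu_percent[busiest] >= pegged and per_cpu_percent[busiest] - others_avg >= gap:
        return busiest
    return None


# Illustrative numbers only.
before = [100.0, 20.0, 25.0, 15.0]   # Processor 0 pegged, RSS off
after = [30.0, 28.0, 25.0, 27.0]     # load spread out, RSS on

print(find_hot_cpu(before))  # prints 0
print(find_hot_cpu(after))   # prints None
```

A skewed result does not prove that RSS is the cause, since any single-threaded workload can pin one core, so treat it as a prompt to check the RSS setting rather than a diagnosis.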
Many people have disabled the Scalable Networking Pack features across the board because of the various issues that were caused by the TCP Chimney feature back in Windows 2003. Those problems have now been fixed in the latest patches and latest network card drivers, so enabling RSS can help increase networking throughput almost twofold. The more features that you offload to the network card, the less CPU you will use overall. This allows for greater scalability of your servers.
You will also want to monitor the number of deferred procedure calls (DPCs) that are created, since there is additional overhead in distributing this load among multiple processors. With the latest hardware and drivers available, this overhead should be negligible.
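As a rough illustration of that monitoring, the sketch below averages per-processor DPC rates from CSV text such as `typeperf` produces for the `\Processor(*)\DPCs Queued/sec` counter. The function name `average_dpc_rates`, the host name `EXCH01`, and the sample values are hypothetical, and the code assumes the usual layout of a timestamp in the first column followed by one column per counter instance.

```python
import csv
import io
import re


def average_dpc_rates(typeperf_csv):
    """Return {processor instance: average DPCs queued/sec} from CSV text."""
    rows = [row for row in csv.reader(io.StringIO(typeperf_csv)) if row]
    if not rows:
        return {}

    header, samples = rows[0], rows[1:]
    averages = {}
    # Column 0 is the timestamp; every other column is one counter instance.
    for col in range(1, len(header)):
        match = re.search(r"Processor\(([^)]+)\)", header[col])
        if match is None:
            continue
        values = []
        for row in samples:
            try:
                values.append(float(row[col]))
            except (IndexError, ValueError):
                continue  # skip blank or malformed samples
        if values:
            averages[match.group(1)] = sum(values) / len(values)
    return averages


# Illustrative sample only; host name, timestamps, and values are made up.
SAMPLE = r'''"(PDH-CSV 4.0)","\\EXCH01\Processor(0)\DPCs Queued/sec","\\EXCH01\Processor(1)\DPCs Queued/sec"
"01/01/2010 03:00:00.000","1200.0","300.0"
"01/01/2010 03:00:01.000","1000.0","500.0"
'''

print(average_dpc_rates(SAMPLE))  # prints {'0': 1100.0, '1': 400.0}
```

Comparing the averages before and after enabling RSS gives a quick view of how the DPC load moved across processors and whether the total grew noticeably.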
In Windows 2008 R2 versions of the operating system, there are new performance counters to help track RSS/Offloading/DPC/NDIS traffic to different processors as shown below.
Stack Send Complete Cycles/sec
Miniport RSS Indirection Table Change Cycles
Build Scatter Gather Cycles/sec
NDIS Send Complete Cycles/sec
Miniport Send Cycles/sec
NDIS Send Cycles/sec
Miniport Return Packet Cycles/sec
NDIS Return Packet Cycles/sec
Stack Receive Indication Cycles/sec
NDIS Receive Indication Cycles/sec
Interrupt Cycles/sec
Interrupt DPC Cycles/sec

Tcp Offload Send bytes/sec
Tcp Offload Receive bytes/sec
Tcp Offload Send Request Calls/sec
Tcp Offload Receive Indications/sec
Low Resource Received Packets/sec
Low Resource Receive Indications/sec
RSS Indirection Table Change Calls/sec
Build Scatter Gather List Calls/sec
Sent Complete Packets/sec
Sent Packets/sec
Send Complete Calls/sec
Send Request Calls/sec
Returned Packets/sec
Received Packets/sec
Return Packet Calls/sec
Receive Indications/sec
Interrupts/sec
DPCs Queued/sec
I hope this helps you understand why you might be seeing this type of CPU usage behavior.
Until next time!!
The Scalable Networking Pack (SNP) caused us a tremendous amount of grief several years ago. We actually dumped Virtual Server because we thought it was the problem with our Windows Server 2003 file server instability (by the way, Microsoft tech support was of no help in diagnosing the problem), only to finally find out on our own that it was SNP that was causing us the problems and yes, we were on the latest Dell drivers at the time.
Thanks for posting this article and acknowledging the numerous issues Microsoft had with this implementation. One can only hope that the Windows Server 2008 implementation is better. If the Exchange 2010 sysadmin had intentionally turned off RSS, I couldn't blame the person.
To be honest, this ultimately had to do with older network card driver versions and the Chimney feature being enabled when the service pack came out. There were a lot of drivers out there that were not fully tested with the SNP features, which led everyone to immediately suspect these new networking features as the heart of the problem. Servers that had the latest network card drivers didn't feel as much pain.
I've worked extensively with the Networking Core team to test a lot of these features out and they have fixed all of those issues including all the problems seen on Windows 2003. As we try to scale our servers to higher levels now, any improvement in the OS layer will surely help with scalability and RSS is just one of those that will help distribute the load amongst CPU cores more evenly. I haven't even touched on other features such as Large Send Offload (LSO) that could also increase performance on your servers. The more work you have the network card perform, the less time spent in CPU cycles.