My name is Flavio Muratore and I am a Senior Support Escalation Engineer with the Windows Core Team. One subject we haven’t written much about in the Core team blog is “disk performance”.
Today I would like to talk a little bit about measuring Physical Disk IO Latency with Windows Performance Monitor (perfmon). Most likely you have some experience with Perfmon; it has been around since the NT days. You have probably heard general statements about what counts as acceptable disk latency, such as “Less than 10 milliseconds is good and more than 20 milliseconds is bad”. Although these rules of thumb simplify analysis, they do not apply in all cases and may lead to incorrect conclusions. Let’s look at how the measurement really works so we can understand these numbers.
Summary: The IO latency measured in perfmon includes all the time spent in the hardware layers as well as the time spent in the Microsoft Port Driver queue (Storport.sys for SCSI). If the running processes generate a large storport queue, the measured latency increases, as IO has to wait before getting dispatched to the hardware layers.
What is disk IO latency? We can define disk IO latency as: A measure of the time delay from the time a disk IO request is created, until the time the disk IO request is completed.
What counters in Windows Performance Monitor show the physical disk latency?
“Physical disk performance object -> Avg. Disk sec/Read counter” - Shows the average read latency.
“Physical disk performance object -> Avg. Disk sec/Write counter” - Shows the average write latency.
“Physical disk performance object -> Avg. Disk sec/Transfer counter” - Shows the combined average for both reads and writes.
The “_Total” instance is an average of the latencies for all physical disks in the computer. Every other instance represents an individual Physical Disk.
Note: Do not confuse these with Disk Transfers/sec, which is a completely different counter: it reports how many IO operations complete per second, not how long each one takes.
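To make the difference between the two kinds of counter concrete, here is a minimal sketch that derives both numbers from the same set of completed IOs. The `CompletedIO` type, the function names, and the sample timestamps are all made up for illustration; this is not how Perfmon is implemented, only the arithmetic behind "seconds per transfer" versus "transfers per second".

```python
from dataclasses import dataclass


@dataclass
class CompletedIO:
    issued_at: float     # seconds, when the request was sent down the stack
    completed_at: float  # seconds, when the request came back


def avg_sec_per_transfer(ios):
    """Average time per IO, the kind of value Avg. Disk sec/Transfer reports."""
    if not ios:
        return 0.0
    return sum(io.completed_at - io.issued_at for io in ios) / len(ios)


def transfers_per_sec(ios, interval_sec):
    """IOs completed per second over the sample interval (a rate, not a latency)."""
    return len(ios) / interval_sec


# Hypothetical one-second sample with four completed IOs.
sample = [
    CompletedIO(0.000, 0.004),
    CompletedIO(0.100, 0.106),
    CompletedIO(0.500, 0.508),
    CompletedIO(0.900, 0.906),
]

latency = avg_sec_per_transfer(sample)
rate = transfers_per_sec(sample, interval_sec=1.0)

print(f"Avg. Disk sec/Transfer ~ {latency * 1000:.1f} ms")
print(f"Disk Transfers/sec     ~ {rate:.0f}")
```

The latency counters are expressed in seconds, so a raw value of 0.006 means 6 milliseconds.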
Where does the performance data come from? For the “physical disk performance object”, the data is captured at the “Partition Manager” level in the storage stack. Keep in mind Perfmon does not create any performance data per se; it only consumes data provided by other subsystems within Windows.
Where is the Partition Manager in the Storage Stack?
A simplified explanation of the Windows Storage Stack follows. When an application creates an IO request, it sends it to the Windows IO Subsystem (at the top of the stack). The IO then makes its way all the way down the stack (to the Hardware Disk Subsystem) and then comes all the way back up. During this process, each layer performs its function and then hands the IO over to the next layer.
So what are we really measuring with the Physical disk performance object -> Avg. Disk sec/Transfer (or /Read, or /Write) counter? We are measuring all the time spent below the Partition Manager level. When the Partition Manager sends the IO request down the stack, we time stamp it. When the request arrives back, we time stamp it again and calculate the time difference. That time difference is the latency.
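The two-time-stamp idea can be sketched in a few lines. This is an illustrative model only, not Windows code: `MeasuringLayer` stands in for the layer that takes the time stamps, `layers_below` stands in for everything underneath it, and `FakeClock` is a hypothetical manual clock used so the numbers are deterministic. The 2 ms and 5 ms figures are invented.

```python
class FakeClock:
    """Hypothetical manual clock so the example gives repeatable numbers."""

    def __init__(self):
        self.now = 0.0

    def advance(self, seconds):
        self.now += seconds


class MeasuringLayer:
    """Stand-in for the layer that time stamps IO on the way down and back up."""

    def __init__(self, clock, lower_layer):
        self.clock = clock
        self.lower_layer = lower_layer
        self.total_latency = 0.0
        self.completed = 0

    def submit(self, request):
        sent = self.clock.now          # first time stamp: IO goes down the stack
        self.lower_layer(request)
        returned = self.clock.now      # second time stamp: IO arrives back
        self.total_latency += returned - sent
        self.completed += 1

    def avg_sec_per_transfer(self):
        return self.total_latency / self.completed if self.completed else 0.0


clock = FakeClock()


def layers_below(request):
    # Assumed: queues, drivers and hardware below the measuring layer take 5 ms.
    clock.advance(0.005)


layer = MeasuringLayer(clock, layers_below)


def application_io(request):
    # Assumed: 2 ms of work above the measuring layer, which is NOT counted.
    clock.advance(0.002)
    layer.submit(request)


for n in range(10):
    application_io(n)

measured = layer.avg_sec_per_transfer()
end_to_end = clock.now / 10

print(f"measured below the layer: {measured * 1000:.1f} ms")
print(f"seen by the application:  {end_to_end * 1000:.1f} ms")
```

In this model the measured value only covers what happens between the two time stamps, so anything the IO experiences above the measuring layer does not show up in it.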
This means we are accounting for the time spent in the following components:
How does disk queuing affect the latency measured in Perfmon? A disk subsystem can accept only a limited number of IOs at a given time. The excess IO is queued until the disk can accept IO again. The time IO spends in the queues below the Partition Manager is included in the Perfmon physical disk latency measurements. As queues grow larger and IO has to wait longer, the measured latency also grows.
There are multiple queues below the Partition Manager level:
Finally, pay special attention to the Port Driver queue (Storport.sys for SCSI). The Port Driver is the last Microsoft component to touch an IO before we hand it off to the vendor-supplied Device Miniport Driver. If the Device Miniport Driver can’t accept any more IO because its queue and/or the hardware queues below it are saturated, IO starts accumulating in the Port Driver queue. The size of the Microsoft Port Driver queue is limited only by the available system memory (RAM), so it can grow very large and cause large measured latency. In conclusion: the time an IO spends in the queue is added to the disk latency reported in perfmon.
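A small simulation shows how this plays out. It is a deliberately simplified sketch with assumed numbers: the hardware accepts `device_slots` IOs at once, every IO takes a fixed `service_time` once accepted, all IOs are issued at the same instant, and everything the hardware cannot accept waits in a single unbounded queue. The function and parameter names are hypothetical, and real storage stacks are more complicated than this.

```python
import heapq


def simulate_burst(num_ios, device_slots, service_time):
    """Issue num_ios at t=0 and return each IO's latency as measured above the queue."""
    free_at = [0.0] * device_slots      # time at which each hardware slot is free
    heapq.heapify(free_at)
    latencies = []
    for _ in range(num_ios):
        start = heapq.heappop(free_at)  # IO waits in the queue until a slot frees up
        done = start + service_time     # the hardware itself always takes service_time
        heapq.heappush(free_at, done)
        latencies.append(done)          # issued at t=0, so latency == completion time
    return latencies


def average(values):
    return sum(values) / len(values)


# Assumed hardware: 32 concurrent IOs, 5 ms each once accepted.
for burst in (32, 64, 256):
    lat = simulate_burst(burst, device_slots=32, service_time=0.005)
    print(f"{burst:4d} outstanding IOs -> average measured latency "
          f"{average(lat) * 1000:.1f} ms")
```

With these made-up numbers the hardware services every IO in 5 ms in all three cases, yet the average measured latency rises from 5 ms at 32 outstanding IOs to 22.5 ms at 256. The extra time is spent entirely waiting in the queue, which is why a fixed rule of thumb can point at the disk when the real cause is the amount of outstanding IO.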
To keep the queue under control you have to tune your applications to limit the maximum number of outstanding I/O operations they generate. That’s a subject for another blog post.
Reference: For SCSI Disks (FC/RAID) you can enable Storport tracing to measure the latency below the Port Driver level. This does not account for the time spent in the Storport queue or anything above it. Essentially, this is the lowest level at which we can monitor latency inside Windows before the IO is handed over to third-party components. Check this excellent blog from the NTdebug team for details. “Storport ETW Logging to Measure Requests Made to a Disk Unit” http://blogs.msdn.com/b/ntdebugging/archive/2010/04/22/etw-storport.aspx
Flavio Muratore Senior Support Escalation Engineer Microsoft Enterprise Platforms Support