Microsoft Enterprise Platforms Support: Windows Server Core Team
My name is Flavio Muratore and I am a Sr. Support Escalation Engineer with the Windows Core team at Microsoft. If you ever find yourself analyzing storage performance with Performance Monitor, this post is for you. We will go beyond the very brief descriptions provided in Perfmon and describe how we calculate the data for the Physical and Logical disk counters.
Why the Performance Monitor? When it comes to disk performance in Windows, the majority of questions can be quickly answered by Performance Monitor alone. Performance Monitor has very low overhead, does a great job with averages and can capture and store data over long periods of time. It is an excellent choice for recording a performance baseline and for troubleshooting. For brevity, this post refers to the Windows Performance Monitor by its nickname: Perfmon. The nickname comes from its executable file, located at %systemroot%\system32\Perfmon.exe.
There are some things Perfmon will not be able to tell us. For advanced analysis, Windows provides us with xPerf, enabling state of the art performance data capture through Event Tracing for Windows (ETW). There is an excellent blog on the subject by Robert Smith (Sr. PFE/SDE): “Analyzing Storage Performance using the Windows Performance Analysis ToolKit (WPT)”.
What is the difference between the Physical Disk vs. Logical Disk performance objects in Perfmon?
Perfmon has two objects directly related to disk performance, namely Physical Disk and Logical Disk. Their counters are calculated in the same way but their scope is different. The Physical Disk performance object monitors disk drives on the computer. Its instances represent the physical hardware, and its counters are the sum of the access to all partitions on the physical instance. The Logical Disk performance object monitors logical partitions. Performance Monitor identifies logical disks by their drive letter or mount point. If a physical disk contains multiple partitions, these counters report the values just for the partition selected and not for the entire disk. On the other hand, when using Dynamic Disks, a logical volume may span more than one physical disk. In this scenario the counter values include the access to the logical disk on all the physical disks it spans.
Disk Counters Explained.
% Disk Time (% Disk Read Time, % Disk Write Time) The “% Disk Time” counter is nothing more than the “Avg. Disk Queue Length” counter multiplied by 100. It is the same value displayed in a different scale. If the Avg. Disk Queue Length is equal to 1, the % Disk Time will equal 100. If the Avg. Disk Queue Length is 0.37, then the % Disk Time will be 37. This is the reason why you can see the % Disk Time being greater than 100%: all it takes is the Avg. Disk Queue Length value being greater than 1. The same logic applies to the % Disk Read Time and the % Disk Write Time. Their data comes from the Avg. Disk Read Queue Length and Avg. Disk Write Queue Length, respectively.
Avg. Disk Queue Length (Avg. Disk Read Queue Length, Avg. Disk Write Queue Length) Avg. Disk Queue Length is equal to (Disk Transfers/sec) * (Avg. Disk sec/Transfer). This is based on “Little’s Law” from the mathematical theory of queues. It is important to note that this is a derived value and not a direct measurement. I recommend reading this article from Mark Friedman; the information still applies to Windows 2008 R2. As you would expect, the Avg. Disk Read Queue Length is equal to (Disk Reads/sec) * (Avg. Disk sec/Read), and the Avg. Disk Write Queue Length is equal to (Disk Writes/sec) * (Avg. Disk sec/Write).
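As a rough illustration of the arithmetic described above, the following Python sketch derives the queue length from the rate and latency counters and then rescales it into % Disk Time. The function names and the sample values are hypothetical and chosen only for illustration; this is not Perfmon's own code.

```python
def avg_disk_queue_length(transfers_per_sec: float, sec_per_transfer: float) -> float:
    """Little's Law: average queue length = completion rate * average time per request."""
    return transfers_per_sec * sec_per_transfer


def percent_disk_time(avg_queue_length: float) -> float:
    """% Disk Time is the average queue length displayed on a scale multiplied by 100."""
    return avg_queue_length * 100.0


# Hypothetical sample: 250 transfers/sec with an average latency of 8 ms.
queue = avg_disk_queue_length(250.0, 0.008)
print(f"Avg. Disk Queue Length: {queue:.2f}")
print(f"% Disk Time: {percent_disk_time(queue):.0f}")
```

With these sample values the derived queue length is about 2, so the % Disk Time comes out well above 100, which matches the behavior described for that counter.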
Current Disk Queue Length Current Disk Queue Length is a direct measurement of the disk queue present at the time of the sampling.
% Idle Time This counter provides a very precise measurement of how much time the disk remained in the idle state, meaning all the requests from the operating system to the disk have been completed and there are zero pending requests. This is how it is calculated: the system timestamps an event when the disk goes idle, then timestamps another event when the disk receives a new request. At the end of the capture interval, we calculate the percentage of the time spent idle. This counter ranges from 100 (meaning always idle) to 0 (meaning always busy).
Disk Transfers/sec (Disk Reads/sec, Disk Writes/sec) Perfmon captures the total number of individual disk IO requests completed over a period of one second. If the Perfmon capture interval is set for anything greater than one second, the average of the values captured is presented. Disk Reads/sec and Disk Writes/sec are calculated in the same way, but break down the results into read requests only or write requests only, respectively.
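The timestamp-based calculation above can be sketched in Python as follows. This is only a minimal model under stated assumptions: the function name, the list of (went idle, new request) timestamp pairs and the sample values are all hypothetical, and the real counter is computed inside Windows, not like this.

```python
def percent_idle_time(idle_periods, interval_start, interval_end):
    """Return the percentage of the capture interval with zero pending requests.

    idle_periods is a list of (went_idle, new_request) timestamps in seconds.
    """
    interval = interval_end - interval_start
    if interval <= 0:
        raise ValueError("capture interval must have a positive length")
    idle = 0.0
    for went_idle, new_request in idle_periods:
        # Clip each idle period to the capture interval before adding it up.
        start = max(went_idle, interval_start)
        end = min(new_request, interval_end)
        if end > start:
            idle += end - start
    return 100.0 * idle / interval


# Hypothetical 10-second interval with idle periods of 3 s and 4.5 s.
periods = [(0.0, 3.0), (5.0, 9.5)]
print(percent_idle_time(periods, 0.0, 10.0))
```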
Disk Bytes/sec (Disk Read Bytes/sec, Disk Write Bytes/sec) Perfmon captures the total number of bytes sent to the disk (write) and retrieved from the disk (read) over a period of one second. If the Perfmon capture interval is set for anything greater than one second, the average of the values captured is presented. The Disk Read Bytes/sec and the Disk Write Bytes/sec counters break down the results displaying only read bytes or only write bytes, respectively.
Avg. Disk Bytes/Transfer (Avg. Disk Bytes/Read, Avg. Disk Bytes/Write) Displays the average size of the individual disk requests (IO size) in bytes, for the capture interval. Example: If the system had ninety-nine IO requests of 8K and one IO request of 2048K, the average will be 28.4K. Calculation = ((8K * 99) + (2048K * 1)) / 100 = 28.4K. The Avg. Disk Bytes/Read and Avg. Disk Bytes/Write counters break down the results showing the average size for only read requests or only write requests, respectively.
Avg. Disk sec/Transfer (Avg. Disk sec/Read, Avg. Disk sec/Write) Displays the average time the disk transfers took to complete, in seconds. Although the scale is seconds, the counter has millisecond precision, meaning a value of 0.004 indicates the average time for disk transfers to complete was 4 milliseconds. This is the counter in Perfmon used to measure IO latency. I wrote a blog specifically about measuring latency with Perfmon. For details go to “Measuring Disk Latency with Windows Performance Monitor”.
Split IO/Sec Measures the rate of IO split due to file fragmentation. This happens if the IO request touches data on non-contiguous file segments. For an explanation about file segments see this blog from Robert Mitchell - The Four Stages of NTFS File Growth.
Logical Disk Exclusive Counters The Logical Disk performance object has all the same counters as the Physical Disk object, and except for the fact that they are reported per logical unit instead of per physical device, they are calculated in the same way. Because the Physical Disk object does not understand volumes, the following counters are exclusive to the Logical Disk object.
% Free Space Displays the percentage of the total usable space on the selected logical disk that was free.
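The worked example above can be checked with a few lines of Python. The helper name is hypothetical, and K is taken to mean 1024 bytes, which is an assumption (the result in K is the same either way).

```python
KB = 1024  # assumption: "K" in the example means 1024 bytes


def avg_bytes_per_transfer(io_sizes_bytes):
    """Average IO size in bytes over the requests seen in one capture interval."""
    if not io_sizes_bytes:
        return 0.0
    return sum(io_sizes_bytes) / len(io_sizes_bytes)


# Ninety-nine 8K requests and one 2048K request, as in the example.
sizes = [8 * KB] * 99 + [2048 * KB]
avg = avg_bytes_per_transfer(sizes)
print(f"{avg / KB:.1f}K")  # prints "28.4K"
```

The single large request pulls the average well above the 8K size that 99% of the requests actually used, which is worth keeping in mind when reading this counter.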
Free Megabytes Displays the unallocated space, in megabytes, on the volume. How can we quickly tell how much free space is available in the volume? Check this blog from Robert Mitchell – NTFS Metafiles.
A few words about Performance Monitor counters averaging and rounding: Perfmon is really good at averaging results and rounding numbers. This enables us to have relatively small log files and still extract the useful information from the data captured. Although the numbers displayed to the user during a live capture and the numbers saved in the log files are rounded, the numbers used in the internal calculations are more precise. When reading the description for some counters in this blog, you probably noticed Perfmon has to calculate an average of averages, which leads to small imprecisions in the final numbers. Also, when we combine this with instances that do further rounding and averaging, like the “_Total” instance, you will see some results are close but do not add up exactly. For example, if you get the “Disk Transfers/sec” over a period of time and subtract both the “Disk Reads/sec” and the “Disk Writes/sec”, the resulting number may not be exactly zero.
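To make the effect concrete, here is a small Python sketch with made-up sample values. It assumes, purely for illustration, that each sample is rounded to a whole number before averaging; Perfmon's actual precision and rounding rules are not documented here and will differ.

```python
from statistics import mean

# Hypothetical per-sample rates (IOs per second); not real Perfmon output.
reads = [10.4, 12.4, 11.4]
writes = [5.4, 6.4, 4.4]
transfers = [r + w for r, w in zip(reads, writes)]


def rounded_mean(samples):
    """Round each sample first, then average the rounded values."""
    return mean(round(s) for s in samples)


# Using the precise values, transfers minus reads minus writes is (nearly) zero.
exact_residual = mean(transfers) - mean(reads) - mean(writes)

# After rounding every sample, the same subtraction no longer cancels out.
rounded_residual = rounded_mean(transfers) - rounded_mean(reads) - rounded_mean(writes)

print(f"residual from precise values: {exact_residual:.6f}")
print(f"residual from rounded values: {rounded_residual}")
```

The rounded residual is not zero even though every individual sample satisfies transfers = reads + writes, which is the kind of small discrepancy described above.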
This is expected and does not pose a problem to the performance analysis at this level. If you can’t tolerate these small imprecisions you will need to use xPerf. xPerf does event tracing and all data is kept with no averaging or rounding. The downside is the resulting log files with xPerf are much bigger than the ones Perfmon creates.
Conclusion: The Windows Performance Monitor is a very powerful diagnostic tool and is capable of answering most questions about the state of disks on the fly. Perfmon uses averaging and rounding to keep only meaningful data in its log files, thus allowing captures over a long period of time.
I must thank a bunch of Microsoft fellows for helping me with this blog. Big thanks to Bruce Worthington (Principal Development Lead), without your knowledge I would not be able to finish this blog. Thanks also to Mark Licata (Principal SE), Robert Smith (Sr. PFE), Clint Huffman (Sr. PFE), John Rodriguez (Principal PFE), Steven Andress (Sr. SEE) and the Storage performance discussion group at Microsoft. It seems so simple now, but it took a lot of sweat to get the exact data to make sure this information is accurate.
Y’all have fun with Perfmon.
Flavio Muratore Senior Support Escalation Engineer Microsoft Enterprise Platforms Support
In most cases Disk Reads/sec, Disk Writes/sec and Current Disk Queue Length are enough to understand the system status and identify the bottleneck. Sometimes you will need to drill down into the process and physical disk counters to check which process does the most reads and writes.
Awesome stuff, Flavio! I'm a PFE for Exchange and was curious what tips you would suggest for someone trying to determine if latency is coming from outside storport (so the storage system itself or components in between the host and the storage), something like antivirus scanning, or other unknowns. I've played with storport logging, but only being able to set a threshold and not know the total # of packets passed during the logging makes it hard to say if the resulting log is 1% of all packets or 50% of all packets over the configured threshold and if a deeper look is warranted. Thank you in advance!
It may just be me, but the link to Measuring Disk Latency with Windows Performance Monitor goes to a "Bad Request" page. Are you sure that link is correct? I would really love to read this article.
We try to use Perfmon to diagnose and troubleshoot storage performance on our Windows 2008 R2 systems. However, it seems that there are many very basic bugs with the Perfmon GUI interface under 2008 and 2008 R2.
It is clear that only basic QA testing was done, as these bugs are replicable on any system (and have been logged under case REG:112021466743035 by myself). This is pretty disturbing stuff as it calls into question the reliability of the entire product. If they haven't fixed the obvious bugs, how can we rely on anything else it tells us?
The problem is very simply replicated: if you have multiple (i.e. more than 10) LUNs or disks on a system, try creating a New Data Collector Set. Add in a handful of individual counters (i.e. don't use Total or All). Do the same for the instances (LUNs), i.e. select them individually instead of using All.
The result is that random counters are dropped from the set. You THINK you just selected Reads/sec, Writes/sec, Disk Queue, and Split IO/sec for your 10 LUNs, but what you actually GET is only some of these.
Other times you click OK and NONE of these are added to the data collector set. The list is blank.
This is a massive issue for us. We have systems with 100+ LUNs and we wish to collect certain counters only for certain LUNs because if we select everything (* for All Instances) our logs get massive. But the only workaround is to do that; to select everything. Not very useful.
This is my 6th logged-and-confirmed MS bug and 5 of them have revolved around basic issues with MS products where large-scale deployments have simply not been tested. Try using MS Cluster or frequent VSS snapshots with 30+ LUNs and you'll quickly see what I mean. It's very concerning because it means MS seems to be skimping on QA... and it's the big customers that will suffer (and consequently not choose MS products). But this one was especially annoying as it prevents Perfmon from doing what it was designed for, i.e. acting as a reliable troubleshooting tool. I hope this gets fixed soon.
is there a place that explains terms like latency, IOPS etc?
Thank you for explaining, this is helpful information.
This is more of a general PerfMon question but I have not been able to find an answer anywhere so I was hoping the Performance or Core Team may know.
I'm trying to troubleshoot an intermittent system hang which causes the system to become completely unresponsive to the point where a reboot is required. When using a data collector set in Performance Monitor, the data is apparently buffered in memory until some point where the buffer fills up and flushes the data to disk (writing it to the .blg file). Since the data only occasionally gets written to disk, when the system hangs and is reset, all of the log data that was in the buffer is lost. I want to be able to log data up to (or nearly up to) the point where the system hangs but this buffering mechanism doesn't allow that to happen.
Is there a way to force counter data to be flushed to disk at a specific interval or just turn off buffering altogether so that the data is written to disk at each sample interval?
Great Article - Exactly what I was looking for