A blog by Jose Barreto, a member of the File Server team at Microsoft.
All messages posted to this blog are provided "AS IS" with no warranties, and confer no rights.
Information on unreleased products is subject to change without notice.
Dates related to unreleased products are estimates and are subject to change without notice.
The content of this site is my personal opinion and might not represent the Microsoft Corporation view.
The information contained in this blog represents my view on the issues discussed as of the date of publication.
You should not consider older, out-of-date posts to reflect my current thoughts and opinions.
© Copyright 2004-2012 by Jose Barreto. All rights reserved.
Follow @josebarreto on Twitter for updates on new blog posts.
The File Server Capacity Tool (FSCT) is a free download from Microsoft that helps you determine the capacity of a specific file server configuration (running Windows or any operating system that implements the SMB or SMB2 protocols). It simulates a specific set of operations (the “Home Folders” workload) being executed by a large number of users against the file server, confirming the ability of that file server to perform the specified operations in a timely fashion. It makes it possible to verify, for instance, if a specific file server configuration can handle 10,000 users. In case you’re not familiar with FSCT’s “Home Folders Workload”, it simulates a standard user’s workload based on Microsoft Office, Windows Explorer, and command-line usage when the file server is the location of the user’s home directory.
We frequently use FSCT internally at Microsoft. In fact, before being released publicly, the tool was used to verify whether a specific change to the Windows code had any significant performance impact in a file server scenario. We continue to use FSCT for that purpose today.
Recently, the File Server Team released a document (available at http://www.microsoft.com/downloads/en/details.aspx?FamilyID=89a73dd0-ed31-4cc2-aa7d-2fded8a023ab) with results from a series of FSCT tests. These tests were performed in order to quantify the file server performance difference between Windows Storage Server 2008 (based on Windows Server 2008) and Windows Server 2008 R2. It was also an exercise to analyze the capacity (in terms of FSCT “Home Folders” users) of some common File Server configurations using between 24 and 192 disks.
2. Comparing Windows Server 2008 and Windows Server 2008 R2 with 24 spindles
The document includes details about how the tests were performed, which specific hardware configurations were used, and what the CPU, memory, disk and network utilization was in each case. It organizes the results by operating system, showing results for all Windows Storage Server 2008 (based on Windows Server 2008) configurations, then the results for all Windows Server 2008 R2 configurations. However, I find it even more interesting to compare two identical hardware configurations running the two different versions of Windows. You can clearly see how the software improved over time. For instance, you see below how a 24-spindle configuration went from supporting 4,500 FSCT users to supporting 7,500 FSCT users. Note how Windows Server 2008 R2 was able to squeeze more out of the server, with increased CPU, memory, disk and network utilization:
* This is actually Windows Storage Server 2008, which is built on Windows Server 2008.
This table provides an interesting snapshot of many items that matter to capacity planning. For instance, you can see how we’re not really hitting a bottleneck on CPU, storage or network. My conclusion here is that we’re bound by the random access performance of the individual drives (random IOPS) and we would need to add more spindles to achieve more users per server. If your goal is to provide a “Home Folders” file service to around 5,000 users and you want to save money, you could go the other way and decide to tweak TESTBED-F: use a system with less RAM (since we’re not exhausting it), or even configure the system with dual 1GbE network interfaces instead of 10GbE (since dual 1GbE can provide around 220 MB/sec). However, if you do change the configuration, you would need to run the tests again, since there could be other interactions when you change the hardware like that.
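That kind of back-of-envelope sizing can be sketched as a quick calculation. The constants below (usable MB/sec per 1GbE link, random IOPS per spindle, throughput and IOPS per FSCT user) are illustrative assumptions of mine, not figures from the document; plug in numbers from your own hardware specs and FSCT runs.

```python
# Rough capacity arithmetic for "Home Folders" sizing.
# All constants here are illustrative assumptions, not FSCT measurements.

GBE_USABLE_MBS = 110.0  # assumed usable MB/sec per 1GbE link after protocol overhead

def network_headroom_mbs(links, per_user_mbs, users):
    """Spare network bandwidth in MB/sec (negative means the NICs saturate)."""
    return links * GBE_USABLE_MBS - per_user_mbs * users

def users_supported_by_disks(spindles, iops_per_spindle, iops_per_user):
    """Rough user count the random-IO capacity of the spindles can sustain."""
    return int(spindles * iops_per_spindle / iops_per_user)

# Dual 1GbE serving 5,000 users at an assumed 0.04 MB/sec per user:
print(network_headroom_mbs(links=2, per_user_mbs=0.04, users=5000))  # -> 20.0

# 24 spindles at an assumed 150 random IOPS each, 0.7 IOPS per user:
print(users_supported_by_disks(spindles=24, iops_per_spindle=150, iops_per_user=0.7))
```

The point of the sketch is the shape of the reasoning, not the numbers: whichever resource runs out first (network bandwidth or random IOPS) caps the user count, which is why adding spindles helps here while adding RAM would not.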
3. Comparing Windows Server 2008 and Windows Server 2008 R2 with 96 spindles
In a similar fashion, a 96-spindle configuration went from supporting 9,500 FSCT users to an impressive 16,500 FSCT users. Again, nothing was changed in the hardware to achieve that improvement. It was just a matter of going from Windows Storage Server 2008 (based on Windows Server 2008) to Windows Server 2008 R2 (and effectively using SMB2 version 2.1 instead of SMB2 version 2.0).
Again, you would need to look deeper to understand the bottleneck here. While FSCT provides a lot of performance counters, it takes a human to figure out what is holding you back. Clearly it’s not memory or CPU. Your network is also not at maximum capacity yet (in theory, you could push at least twice what TESTBED-E is using over 10GbE). So, again, the bottleneck here has to be the storage. As I mentioned before, if your goal is to configure a system to provide service to around 10,000 users, you could probably adjust TESTBED-E’s configuration a bit (use less memory, use just one processor instead of two, reduce the number of disks) to shrink the overall acquisition cost a little while keeping the performance at a good level for that number of users. Again, you would need to rerun FSCT with that new configuration to be sure.
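To make the “it takes a human” triage concrete, here is a minimal sketch of the kind of elimination you do over averaged counters from a run. The thresholds are my own illustrative assumptions, not values from the Microsoft document; real triage also looks at latency, not just utilization.

```python
# Naive bottleneck triage over averaged performance counters from a test run.
# Thresholds are illustrative assumptions; tune them for your own environment.

def likely_bottleneck(cpu_pct, free_mem_pct, nic_util_pct, disk_queue_per_spindle):
    """Return a rough guess at the limiting resource, by elimination."""
    if cpu_pct > 85:
        return "cpu"
    if free_mem_pct < 10:
        return "memory"
    if nic_util_pct > 85:
        return "network"
    if disk_queue_per_spindle > 2:
        # sustained queue depth per disk points at a random-IO limit
        return "storage"
    return "undetermined"

# A run where CPU, memory and network all have headroom but the disks are queuing:
print(likely_bottleneck(cpu_pct=55, free_mem_pct=40, nic_util_pct=30,
                        disk_queue_per_spindle=3.1))  # -> storage
```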
4. Running Windows Server 2008 R2 with 192 spindles
The document also includes a 192-spindle configuration using Windows Server 2008 R2. This is one of the most impressive FSCT results I have ever seen. In this test, a single file server was able to successfully handle 23,000 FSCT users running the “Home Folders” workload simultaneously. I wonder if you could find a similar NAS appliance configuration out there able to handle this number of FSCT users... Here are the results:
In this configuration, it is much harder to find the bottleneck. We have a good amount of free memory, but we’re hitting fairly high CPU utilization for a file server workload. Both the storage and the network are fairly busy as well, at around 600 MB/sec. Also note that we’re using RAID-0 here, so this configuration is not realistic for a production deployment.
5. Charts and Diagrams
Each of the configurations also includes a chart with the throughput (in FSCT scenarios per second), CPU utilization and the total number of FSCT users the configuration can handle, as you can see below. These charts were created in Microsoft Excel from the text results produced by FSCT. For example, here’s the chart for the 192-spindle configuration:
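FSCT itself reports the supported-user number, but the charts make the idea visible: throughput scales with the number of users until it flattens out. Here is a minimal sketch of that “knee” logic over made-up (users, scenarios/sec) data points; FSCT’s actual pass/fail criteria are based on operations completing in a timely fashion, which this simplification ignores.

```python
# Sketch: picking the supported-user count as the last point before
# throughput stops scaling. The data points below are made up.

def supported_users(points, flatten_tolerance=0.02):
    """Return the highest user count before throughput flattens.

    points: list of (users, scenarios_per_sec), sorted by users.
    """
    best = points[0][0]
    for (u0, t0), (u1, t1) in zip(points, points[1:]):
        # if throughput grew by less than the tolerance, we've hit the ceiling
        if t1 <= t0 * (1 + flatten_tolerance):
            break
        best = u1
    return best

data = [(5000, 55.0), (10000, 110.0), (15000, 165.0),
        (20000, 210.0), (23000, 231.0), (25000, 232.0)]
print(supported_users(data))  # -> 23000
```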
The document also provides information about the hardware used in each of the configurations, including disks, arrays, storage fabric, server, network and clients used to generate the load. There is enough information there to allow you to reproduce the tests in your own environment or lab. For instance, here’s a diagram of the 192-spindle configuration:
6. Table of Contents
This blog post provides just a sample of the information contained in the document. Here is the full table of contents:
As you can see, the document is rich in detail. If your work is related to planning, sizing or configuring file servers, it could be very useful.
I would highly recommend downloading the full document from http://www.microsoft.com/downloads/en/details.aspx?FamilyID=89a73dd0-ed31-4cc2-aa7d-2fded8a023ab
I would also encourage you to experiment with FSCT yourself. You can start at http://blogs.technet.com/b/josebda/archive/2009/09/16/file-server-capacity-tool-fsct-1-0-available-for-download.aspx