“How do I size file servers?” is a perennial question, and we recently had a mail thread in response to a customer question about sizing file server clusters. We do have some guidance about server cluster capacity planning and file server RAM and CPU sizing in the Windows Server 2003 Deployment Kit. Elden Christensen, a resident cluster expert, sums the answer up nicely:

There are no special performance considerations when planning the scalability of a clustered server compared to a stand-alone server. The number of supported user connections, volume sizing, and so on are all the same.

The one consideration with clustering is that you need to do performance planning for your worst-case failure scenario. Say, for example, you build an Active/Active file server with 2000 users connected to each node. If one of the nodes goes down, a single node will host all resources, and all 4000 users will attempt to connect to it. With this in mind, you need to decide on the worst-case performance degradation you will accept for users. If you require that the user experience stay nearly the same after a failure, then you would not want the last remaining node to exceed 100% utilization. This means that in a normal situation you would not want to exceed 50% utilization on either node, so that after a failover the node hosting all resources does not exceed 100% utilization.

Now, most customers are more concerned with high availability than with performance. They are willing to accept degraded performance for a short period after a node failure, until they resolve the issue. They are just concerned with users being up and able to access their data. In this case, the maximum load you put on any one node might be 75%. Then, in the event of a failover, the surviving node faces a load of 150% of its capacity. It still services requests, but with diminished performance.
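The arithmetic behind the 50% and 75% figures can be sketched in a few lines of Python. This is only an illustration, not a sizing tool: the function names are made up for this example, and it assumes identical nodes, an equal load on every node before the failure, and load that moves to the surviving nodes with no extra overhead.

```python
def utilization_after_failover(normal_utilization, nodes=2, failed=1):
    """Load on each surviving node, as a fraction of one node's capacity.

    Assumption: identical nodes, equal load per node before the failure,
    and the failed nodes' load is spread evenly over the survivors.
    """
    survivors = nodes - failed
    if survivors < 1:
        raise ValueError("at least one node must survive")
    return normal_utilization * nodes / survivors


def max_normal_utilization(accepted_after_failover, nodes=2, failed=1):
    """Highest normal per-node utilization that keeps each survivor at or
    below the load you are willing to accept after a failover."""
    survivors = nodes - failed
    if survivors < 1:
        raise ValueError("at least one node must survive")
    return accepted_after_failover * survivors / nodes


if __name__ == "__main__":
    # Two-node Active/Active examples from the text.
    print(utilization_after_failover(0.50))  # nodes run at 50% normally
    print(utilization_after_failover(0.75))  # nodes run at 75% normally
    print(max_normal_utilization(1.00))      # survivor must stay within 100%
```

A result above 1.0 means the surviving node is asked to carry more than its capacity, which is the degraded-performance case described above.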

Now, for the most mission-critical and performance-sensitive implementations, an Active/Passive deployment may make the most sense, because there is no performance degradation of any kind in the event of a failover.

You could also scale up the number of nodes, for example with a 3-node Active/Active/Passive configuration. In this configuration the loss of a single node would cause no performance degradation for users, because the passive node takes over for the failed one. Only the very rare scenario where you lose 2 nodes would result in reduced performance. By scaling up the number of nodes, you can achieve better results for the most common failures.
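The same reasoning extends to configurations with passive nodes. The sketch below is a rough model, not a description of how the cluster service actually places resources: "workload" is a term introduced here for the set of resources one active node normally hosts, workloads are assumed to be equal in size and to fail over as whole units, and the cluster is assumed to spread them as evenly as possible over the surviving nodes.

```python
import math


def busiest_node_load(active_workloads, total_nodes, failed_nodes,
                      load_per_workload):
    """Load on the busiest surviving node, as a fraction of its capacity.

    Assumption: equal-sized workloads that move as whole units and are
    spread as evenly as possible across the surviving nodes.
    """
    survivors = total_nodes - failed_nodes
    if survivors < 1:
        raise ValueError("at least one node must survive")
    # The busiest survivor hosts the rounded-up share of the workloads.
    workloads_on_busiest = math.ceil(active_workloads / survivors)
    return workloads_on_busiest * load_per_workload


if __name__ == "__main__":
    # 3-node Active/Active/Passive: 2 workloads, each using 75% of a node.
    for failed in (0, 1, 2):
        load = busiest_node_load(2, 3, failed, 0.75)
        print(f"{failed} node(s) failed: busiest node at {load:.0%}")
```

Under these assumptions, the busiest node's load is unchanged after the first failure and only doubles after the second, which matches the Active/Active/Passive description above.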

The implementation varies depending on the business needs, but the capacity of a single node is the same as that of a stand-alone server. Clusters are unique in that load shifts dynamically, and you need to plan accordingly.