EDIT: This post has been updated on 5/6/10 for the new version of storage calculator. For the list of latest major changes, please see THIS.

In order to assist customers in designing their storage layout for Exchange 2007, we have put together a calculator that focuses on driving the storage requirements (I/O performance and capacity) and what the optimal LUN layout should be based on a set of input factors.

The calculator uses all the recommendations outlined in the following articles, and thus we recommend you read them before utilizing the calculator:

The calculator is broken out into the following sections (worksheets):

  • Input
  • Storage Requirements
  • LUN Requirements
  • Backup Requirements
  • Log Replication Requirements
  • Storage Design

Important: The data points provided in the calculator are an example configuration. As such, any data points entered into the Input worksheet are specific to that particular configuration and do not apply to other configurations. Please ensure you are using the correct data points for your design.

Important: Changes made to the calculator in the v11.x time frame have resulted in the calculator no longer being backwards compatible with older versions of Excel. If you do not have Excel 2007 and would like to use the calculator, please download VirtualPC 2007 (http://www.microsoft.com/downloads/details.aspx?FamilyID=04d26402-3199-48a3-afa2-2dc0b40a73b6&DisplayLang=en) and the Office 2007 VHD (http://www.microsoft.com/downloads/details.aspx?FamilyID=f9956176-cf66-478b-b20d-b9b92dd0dbfa&DisplayLang=en).

Input

This section is where you enter in all the relevant information regarding your design, so that the calculator can generate what you need in order to achieve your design.

Note: There are many input factors that need to be accounted for before you can design your solution. Each input factor is briefly listed below; there are additional notes within the calculator that explain them in more detail.

Step 1 - Server Configuration

 

Exchange Server Configuration
  1. Which version of Exchange 2007 are you using?  Depending on the version you select (RTM or SP1+), there is a different amount of RAM required per storage group.  Exchange 2007 SP1 and later requires less RAM per storage group than the RTM version does, due to changes in the Jet architecture.
  2. How many mailbox servers are you going to deploy?  If you enter more than a single server, the calculator will evenly distribute the user mailboxes across the total number of mailbox servers and make performance and capacity recommendations for each server, as well as for the entire environment.
  3. What High Availability configuration are you deploying? You can select none, LCR, CCR, or SCC.
  4. Are you using Content Indexing? By default this is enabled in Exchange 2007 and requires an additional 5% capacity per database for each storage group LUN.
  5. Are you going to deploy a Dedicated Restore LUN? A dedicated restore LUN is used as a staging point for the restoration of data or could be used during maintenance activities; if one is selected then additional capacity will not be factored into each database LUN.
  6. What percentage of disk space do you want to ensure remains free on the LUN? Most operations management programs have capacity thresholds that alert when a LUN is more than 80% utilized. This value allows you to ensure that each LUN has a certain percentage of disk space available so that the LUN is not designed and implemented at maximum capacity.
Exchange Data Configuration
  1. What will be the deleted item retention? By default in Exchange 2007, the deleted item retention per database is 14 days.
  2. What will be the Data Overhead Factor? Microsoft recommends using 20% to account for any extraneous growth that may occur.
  3. How many mailboxes do you move per week? In terms of transactions, you have to take into account how many mailboxes you will either be moving to this server or within this server, as transactions will always get generated in the target storage group.
IOPS Configuration
  1. What will be the I/O Overhead Factor? Microsoft recommends using 20% to ensure adequate I/O headroom for abnormal spikes in I/O that may occur from time to time.
  2. Additional IOPS Requirements / Server? In other words, what additional I/O requirements do you need to factor into the solution for each mailbox server (e.g. certain third-party mobility products have additional I/O requirements that need to be factored into any design if they are being utilized)? This may require additional testing by comparing a baseline system against a system that has the I/O generating application installed and running.
Standby Continuous Replication Configuration
  1. Are you going to deploy Standby Continuous Replication (SCR) with this server? If so, choose the number of SCR targets you will have for each source mailbox server (note: if you do choose to have an SCR target, the calculator assumes that all storage groups on the source server will have an SCR target).
  2. What will be the SCR target's high availability configuration?  You can select either "Single-Node" or "Match Source Configuration".  If you choose "Single-Node", you are either deploying 2-node source clusters (CCR or SCC) and want only a single node as the standby cluster, or you are deploying standalone mailbox servers (or LCR); "Single-Node" also applies if you are performing database portability instead of server recovery.  If you choose "Match Source Configuration", you are performing server recovery to retain the same level of availability as the source environment.
  3. What will be the SCR log replay delay? This parameter is used to specify the amount of time that the Microsoft Exchange Replication service should wait before replaying log files that have been copied to the SCR target computer. The default is 1 day (86400 seconds) and you can configure up to 7 days. Or you can disable log replay delay by setting the input to 0, in which case the replication service will delay the last 50 logs from being replayed into the SCR target database. The value you specify here will influence the log capacity requirements.
  4. What will be the SCR log truncation delay? This parameter is used to specify the amount of time that the Microsoft Exchange Replication service should wait before truncating log files that have been copied to the SCR target computer and replayed into the copy of the database. The time period begins after the log has been successfully replayed into the copy of the database. The maximum allowable setting for this value is 7 days. The minimum allowable setting is 0 seconds, although setting this value to 0 seconds effectively eliminates any delay in log truncation activity.
  5. What is your SCR Recovery Point Objective?  Enter in the recovery point objective (RPO) for which you are designing; this will help determine the log replication throughput requirements necessary for the SCR targets.
Database Configuration
  1. Do you want to follow Microsoft's recommendations regarding maximum database size? Microsoft recommends that a database be no larger than 100GB when continuous replication is not in use and no larger than 200GB when continuous replication is in use. This is by no means a hard limit, but a recommendation based on the impact database size has on recovery times. If you want to follow Microsoft's recommendation, then select Yes. Otherwise, select No.
  2. Do you want to specify a custom Maximum Database Size? If you selected No for the previous field, then you need to enter in a custom maximum database size.
Step 2 - Mailbox Configuration

The calculator provides the capability to design a storage solution that can support three different tiers (or classes) of mailbox users.

Mailbox Configuration
  1. How many mailboxes will you deploy on the server or in the environment? If deploying a single server environment, this is how many mailboxes you will deploy on this server.  If you are deploying multiple servers, then this is how many mailboxes you will deploy in the environment.  For example, if you choose to deploy 5 servers, and want 3000 mailboxes per server, then enter 15000 here.
  2. What is the solution's projected growth in terms of number of mailboxes over its lifecycle?  Enter in the total percentage by which you believe the number of mailboxes will grow during the solution's lifecycle.  For example, if you believe the solution will increase by 30% and you are starting out with 1000 mailboxes, then at the end of the lifecycle, the solution will have 1300 mailboxes.  The calculator will utilize the projected growth plus the number of mailboxes to ensure that the capacity and performance requirements can be sustained throughout the solution's lifecycle.
  3. How much mail do the users send and receive per day on average? The usage profiles found here are based on the work done around the memory and processor scalability requirements.
  4. What is the average message size? For most customers the average message size is around 50KB.
  5. What will be the prohibit send & receive mailbox size limit? If you want to adequately control your capacity requirements, you need to set a hard mailbox size limit (prohibit send and receive) for the majority of your users.
  6. Predict IOPS Value? This question asks whether you want the calculator to determine the IOPS / mailbox value. By default the calculator will predict the IOPS / mailbox value based on the number of messages per mailbox, the user memory profile, and the Outlook mode in which the mailboxes are operating. For some customers that want to design toward a specific I/O profile, this option will not be viable. Therefore, if you want to design toward a specific I/O profile, select No to the "Predict IOPS Value" question.
  7. Do you want to include an IOPS Multiplication Factor in the prediction or custom I/O profile? The IOPS Multiplication Factor can be used to increase the IOPS/mbx footprint for mailboxes that require additional I/O (for example, these mailboxes may use third-party mobile devices).  The way this value is used is as follows: (IOPS value * Multiplication Factor) = new IOPS value.
  8. IOPS / Mailbox? Only enter a value in this field if you selected "No" to the "Predict IOPS Value" question.
  9. What will be the database read:write ratio? Only adjust this value if you selected "No" to the "Predict IOPS Value" parameter. When IOPS prediction is enabled, the calculator will calculate the read:write ratio based on the message profile and the Outlook mode in use.
  10. In what Outlook mode will the majority of the clients operate? Select either Online or Cached Mode depending on how the majority of your users operate (>75%).
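The mailbox-count arithmetic described in items 1 and 2 above (growth applied to the total, then an even split across servers) can be sketched in Python. This is only an illustration: the function names and the choice to round up are assumptions, not the calculator's actual formulas.

```python
def mailboxes_after_growth(total_mailboxes: int, growth_percent: int) -> int:
    """Total mailboxes at the end of the lifecycle, rounded up (assumption)."""
    # Integer ceiling division avoids floating-point rounding surprises.
    return (total_mailboxes * (100 + growth_percent) + 99) // 100


def mailboxes_per_server(total_mailboxes: int, growth_percent: int, servers: int) -> int:
    """Evenly distribute the grown mailbox population across the mailbox servers."""
    grown = mailboxes_after_growth(total_mailboxes, growth_percent)
    # Round up so that every mailbox is assigned to a server (assumption).
    return (grown + servers - 1) // servers


if __name__ == "__main__":
    print(mailboxes_after_growth(1000, 30))    # the 30% growth example from the text
    print(mailboxes_per_server(15000, 0, 5))   # the 5-server example from the text
```

The two calls reproduce the examples given in the list: 1000 mailboxes growing by 30%, and 15000 mailboxes spread over 5 servers.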
Client Configuration
  1. What will be the user concurrency? Typically most customers should design toward 100% concurrency.
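To make the relationship between the I/O inputs more concrete, here is a rough Python sketch. Only the step (IOPS value * Multiplication Factor) is taken directly from the text above; the way concurrency, the I/O Overhead Factor, and the additional IOPS per server are combined here is an assumption for illustration, and the calculator itself may combine them differently.

```python
def required_host_iops(mailboxes: int,
                       iops_per_mailbox: float,
                       multiplication_factor: float = 1.0,
                       concurrency: float = 1.0,
                       io_overhead: float = 0.20,
                       additional_iops: float = 0.0) -> float:
    """Estimate the host IOPS one mailbox server must sustain (illustrative only)."""
    # From the text: (IOPS value * Multiplication Factor) = new IOPS value.
    effective_iops = iops_per_mailbox * multiplication_factor
    # Assumption: only concurrently active mailboxes generate I/O.
    user_iops = mailboxes * effective_iops * concurrency
    # Assumption: overhead is applied as headroom on top of the user I/O,
    # and the additional per-server IOPS is then added unchanged.
    return user_iops * (1 + io_overhead) + additional_iops


if __name__ == "__main__":
    # Hypothetical profile: 3000 mailboxes at 0.5 IOPS each, 100% concurrency.
    print(required_host_iops(3000, 0.5))
```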
Step 3 - Backup Configuration

Backup Configuration
  1. What backup methodology will be used to backup the solution?  Choose Hardware VSS Backup/Restore, Software VSS Backup/Restore, Streaming, or VSS Backup Only.  The backup methodology will affect the LUN design.
  2. What will be the backup frequency?  You can choose Daily Full, Weekly Full with Daily Differential, or Weekly Full with Daily Incremental.  The backup frequency will affect the LUN design and the disk space requirements (e.g. if performing daily differentials, then you need to account for 7 days of log generation in your capacity design).
  3. What is the streaming backup rate in MB/s for your environment? Enter in the rate at which you can backup your Exchange data when performing a streaming (online) backup.
  4. What is the streaming restore rate in MB/s for your environment?  Enter in the rate at which you can restore your Exchange data when performing a streaming (online) restore.
  5. How many times can you operate without log truncation? Select how many times you can survive without a full backup or an incremental backup. For example, if you are performing a weekly full backup and daily differential backups, the only time log truncation occurs is during the full backup. If the full backup fails, then you have to wait an entire week to perform another full backup or perform an emergency full backup. This parameter allows you to ensure that you have enough capacity to not have to perform an immediate full backup.
Step 4 - Replication Requirements

The data for this section will help determine the appropriate log bandwidth requirements for both geographically dispersed CCR and SCR configurations.

 

Log Replication Configuration
  1. How many transaction logs are generated for each hour in the day? Enter in the number of transaction logs that are generated for each hour in the day.

Now you may be wondering how you can collect this data. We've written a simple VBS script that lists all files in a folder and writes the list to a log file. You can use Task Scheduler to execute this script at certain intervals in the day (e.g. every 15 minutes). Once you have generated the log file for a 24-hour period, you can import it into Excel, massage the data (i.e. remove duplicate entries) and determine how many logs are generated for each hour. If you do this for each storage group, you will be able to determine your log generation rate for each hour in the day. This script is named collectlogs.vbsrename (just rename it to collectlogs.vbs) and you can find it here:

Collectlogs VBS script
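If you would rather not massage the data in Excel, the de-duplication and per-hour counting can be sketched in Python. This is not the collectlogs script and does not parse its output; the input format (pairs of file name and the time a snapshot saw that file) and the file names below are hypothetical.

```python
from collections import Counter
from datetime import datetime
from typing import Iterable, List, Tuple


def logs_per_hour(snapshots: Iterable[Tuple[str, datetime]]) -> List[int]:
    """Count newly seen log files for each hour of the day (index 0 through 23)."""
    first_seen = {}
    for name, seen_at in snapshots:
        # A log file appears in every snapshot taken after it was created;
        # keep only its earliest sighting and treat the rest as duplicates.
        if name not in first_seen or seen_at < first_seen[name]:
            first_seen[name] = seen_at
    counts = Counter(seen_at.hour for seen_at in first_seen.values())
    return [counts.get(hour, 0) for hour in range(24)]


if __name__ == "__main__":
    sample = [
        ("E0000000001.log", datetime(2010, 5, 6, 8, 0)),
        ("E0000000001.log", datetime(2010, 5, 6, 8, 15)),  # duplicate sighting
        ("E0000000002.log", datetime(2010, 5, 6, 8, 15)),
        ("E0000000003.log", datetime(2010, 5, 6, 9, 0)),
    ]
    print(logs_per_hour(sample))
```

The resulting 24 values correspond to the hourly entries the Log Replication Configuration table asks for. Note that a file's first sighting is only as precise as the snapshot interval.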

Network Configuration
  1. What type of network link will you be using between the servers?  Select the appropriate network link you will be using between the two nodes in the geographically dispersed cluster or between the SCR source and SCR targets.

  2. What is the latency on the network link? Enter in the latency (in milliseconds) that exists on the network link.

  3. How long can you survive a network outage?  When a network outage occurs, log replication cannot occur.  As a result, the copy queue length will increase on the source; in addition, log truncation cannot occur on the source.  For geographically dispersed CCR or remote SCR deployments, network outages can seriously affect the solution's usefulness.  If the outage lasts too long, log capacity on the source may become compromised and, as a result, a manual log truncation event must occur.  Once that happens, the remote copies must be reseeded.  The Network Failure Tolerance parameter ensures there is enough capacity on the log LUNs to survive an extended network outage.

Storage Requirements

This section deals with outputting the I/O performance and capacity storage requirements based on the input factors entered into the calculator.

Calculations

The Calculations Pane performs all the calculations based on the input factors and outputs the key calculations into the Results Pane. For this blog, I will not delve into the specifics of the calculations, but feel free to review them within the calculator.

Results

Based on the above input factors the calculator will recommend the following settings.

Number of Servers & Data Copies

The Number of Servers and Data Copies table will provide you with

  • The Number of Mailbox Servers that will exist in your environment.  This value is based on the Number of Exchange Mailbox Servers that you entered in the Input section.
  • The Number of SCR Target Servers.  This value is based on the Number of Exchange Mailbox Servers and the Number of SCR Targets / Source Server that you entered in the Input section.  For SCR targets, it is assumed that if you are utilizing CCR as the source and you are matching the source HA configuration for the SCR target, then you will be replicating the storage groups to both nodes of the SCR standby cluster.
  • The Number of Data Copies value will tell you how many copies of the data you will have.  For example if you selected LCR or CCR, you will have at least 2 copies (you could have more if you specified a number of SCR targets).   For SCR targets, it is assumed that if you are utilizing CCR as the source and you are matching the source HA configuration for the SCR target, then you will be replicating the storage groups to both nodes of the SCR standby cluster.
User Mailbox Configuration

The Mailbox Configuration table will provide you with

  • The Number of Mailboxes that you entered in the Input section (this value will include the projected growth).
  • The Mailbox Size is the actual mailbox size on disk that factors in the prohibit send/receive limit, the number of messages the user sends/receives per day, the deleted item retention window, and the average database daily churn per mailbox. It is important to note that the Mailbox size on disk is actually higher than your mailbox size limit; this is to be expected.
  • The Database Cache / Mailbox value is the necessary amount of RAM per mailbox that is needed to increase the database cache so that the number of database reads can be reduced.
  • The Transaction Logs Generated / Mailbox value is based on the message profile selected and the average message size and indicates how many transaction logs will be generated per mailbox.
  • The IOPS / Mailbox value is the calculated IOPS / Mailbox value, which is based on the number of messages per mailbox, the user memory profile, and the Outlook mode in which the mailboxes are operating. If you chose to enter a specific IOPS / mailbox value rather than allowing the calculator to determine it from the above factors, then this value will be that custom value.
  • The Read:Write ratio / Mailbox value defines the percentage of the mailbox's IOPS profiles that are read I/Os.  This information is required to accurately design the storage subsystem I/O requirements.
Solution Configuration

The Solution Configuration table will provide you with

  • The Recommended RAM Configuration for the mailbox server.  This is the amount of RAM needed to support the number of databases required, in addition to the number of mailboxes based on their memory profile.
  • The Recommended Number of Databases is the calculated number of databases required to support the mailbox population. This number can be used with the DPM 2007 Storage Calculator.  Also, if you selected to have multiple mailbox servers, the Total for all Servers column will output the total number of storage groups for all mailbox servers.
  • The Recommended Number of Mailboxes / Database is the calculated number of mailboxes per database ensuring that the database size does not go above the recommended size limit (for non-Continuous Replication (CR) systems 100GB, for CR systems 200GB).
  • The Number of Tier-x Mailboxes / Database provides a breakdown of how many mailboxes from each mailbox tier will be stored within a database.
  • The Total Number of Mailboxes outlines how many mailboxes will reside on each server, as well as the total number of mailboxes that will exist within the environment if you selected to have multiple mailbox servers.
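As a rough illustration of how a maximum database size translates into mailboxes per database and a database count, consider the Python sketch below. It is a simplification under stated assumptions: it ignores the Data Overhead Factor and the per-tier breakdown, and the function name and sample mailbox size are hypothetical rather than taken from the calculator.

```python
import math
from typing import Tuple


def database_layout(mailboxes: int,
                    mailbox_size_on_disk_mb: int,
                    max_database_size_gb: int) -> Tuple[int, int]:
    """Return (mailboxes per database, number of databases) for one server."""
    max_database_size_mb = max_database_size_gb * 1024
    # Fill each database only as far as the size limit allows.
    per_database = max_database_size_mb // mailbox_size_on_disk_mb
    # Round up: a partially filled database still has to exist.
    databases = math.ceil(mailboxes / per_database)
    return per_database, databases


if __name__ == "__main__":
    # Hypothetical server: 3000 mailboxes of 2048 MB on disk each.
    print(database_layout(3000, 2048, 200))  # continuous replication in use
    print(database_layout(3000, 2048, 100))  # continuous replication not in use
```

Halving the maximum database size doubles the number of databases in this example, which is why the continuous replication choice has such a visible effect on the layout.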
Transaction Log Requirements

The Transaction Log Requirements table will provide you with

  • The User Transaction Logs Generated / Day indicates how many transaction logs will be generated during the day for the server. The Total for all Servers column outputs the total number of user transaction logs generated across all mailbox servers.
  • The Average Mailbox Move Transaction Logs Generated / Day indicates how many transaction logs will be generated by mailbox moves during the day for the server. This number assumes that an equal percentage of mailboxes will be moved each day, as opposed to moving all mailboxes on the same day. The Total for all Servers column outputs the average move mailbox transaction logs generated across all mailbox servers.
  • The Average Transaction Logs Generated / Day is the total number of transaction logs that are generated per day on the server (includes user generated logs and mailbox move generated logs). The Total for all Servers column outputs the average number of transaction logs generated across all mailbox servers.
  • The User Transaction Logs Generated / SG / Day indicates how many transaction logs will be generated during the day for each storage group.
  • The Average Mailbox Move Transaction Logs Generated / SG / Day indicates how many transaction logs will be generated by mailbox moves during the day for each storage group. This number assumes that an equal percentage of mailboxes will be moved each day, as opposed to moving all mailboxes on the same day.
  • The Average Transaction Logs Generated / SG / Day is the total number of transaction logs that are generated per day for a storage group on the server (includes user generated logs and mailbox move generated logs). This number can be used with the DPM 2007 Storage Calculator.
Disk Space & Performance Requirements

The Disk Space & Performance Requirements table will provide you with

  • The Database Space Required / Replica is the amount of space required to support the database infrastructure for each replica that exists. This value is derived from the mailbox size on disk, the data overhead factor, whether a dedicated restore LUN is available, and the use of content indexing. The Total for all Mailbox Servers column outputs the total database disk space required for all mailbox servers. The Total for all SCR Servers column outputs the total database disk space required for all SCR target servers.
  • The Log Space Required / Replica is the amount of space required to support the log infrastructure for each replica that exists. This value takes into account the number of mailboxes moved per week (assumes worst case and that all mailboxes are moved on the same day), the type of backup frequency in use, the number of days that can be tolerated without log truncation and the number of transaction logs generated per day. This number can be used with the DPM 2007 Storage Calculator. The Total for all Mailbox Servers column outputs the total log disk space required for all mailbox servers. The Total for all SCR Servers column outputs the total log disk space required for all SCR target servers.
  • The Database LUN Space Required / Replica is the LUN size required to support the database infrastructure for each replica that exists. The Total for all Mailbox Servers column outputs the database LUN disk space required for all mailbox servers. The Total for all SCR Servers column outputs the database LUN disk space required for all SCR target servers.
  • The Log LUN Space Required / Replica is the LUN size required to support the log infrastructure for each replica that exists.  The Total for all Mailbox Servers column outputs the log LUN disk space required for all mailbox servers.  The Total for all SCR Servers column outputs the log LUN disk space required for all SCR target servers.
  • The Restore LUN Size / Node (and / SCR Targets) is the amount of space needed to support a restore LUN if the option was selected in the Input Factor section; this will include space for up to 7 databases and 7 transaction log sets. If CCR is chosen as the continuous replication solution, then a Restore LUN will be provisioned for each node in the cluster. If there are SCR targets, then you will also need to provision a restore LUN on each SCR target server. The Total for all Mailbox Servers column outputs the total Restore LUN disk space required for all mailbox servers. The Total for all SCR Servers column outputs the total Restore LUN disk space required for all SCR target servers.
  • The Total Required Database IOPS is the amount of read and write host I/O the database disk set must sustain during peak load. The Total for all Mailbox Servers column outputs the total database IOPS required for all mailbox servers. The Total for all SCR Servers column outputs the total database IOPS required for all SCR target servers.
  • The Total Required Log IOPS is the amount of read and write host I/O that will occur against the transaction log disk set. The Total for all Mailbox Servers column outputs the total log IOPS required for all mailbox servers. The Total for all SCR Servers column outputs the total log IOPS required for all SCR target servers.
  • The Database Read I/O Percentage defines the percentage of database required IOPS that are read I/Os.  This information is required to accurately design the storage subsystem I/O requirements.

LUN Requirements

The LUN Requirements section is really a continuation of the Storage Requirements section. It outlines what we believe is the appropriate LUN design based on the input factors and the analysis performed in the previous section.

Note: The term LUN utilized in the calculator refers only to the representation of the disk that is exposed to the host operating system. It does not define the disk configuration.

LUN Design

The LUN Design highlights the LUN architecture chosen for this server solution. The architecture is derived from the backup type and frequency that were chosen in the Input section.

  • If you selected to perform a weekly full backup and are not using hardware-based VSS as a backup solution, then we will recommend the 2 LUNs / Backup Set approach. This approach places the storage group backup set on the same log and db LUN. This can reduce the number of LUNs on the server. For example, if you have 14 databases, the calculator will recommend that SG1-7 be grouped together on 2 LUNs; this becomes the backup set. SG8-14 will be grouped together on another 2 LUNs to become a second backup set.
  • If you selected to use hardware VSS as a backup method or are performing daily full backups, then we recommend the 2 LUNs / Storage Group approach. This approach places each storage group set on its own set of LUNs.
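The two rules above amount to a small decision, sketched here in Python. The option strings mirror the choices listed in the Backup Configuration inputs; the function names are hypothetical, and the default of seven storage groups per backup set is taken from the SG1-7 / SG8-14 example rather than from the calculator's internals.

```python
from typing import List, Tuple


def choose_lun_architecture(backup_method: str, backup_frequency: str) -> str:
    """Pick the LUN architecture from the backup method and frequency."""
    uses_hardware_vss = backup_method == "Hardware VSS Backup/Restore"
    daily_full = backup_frequency == "Daily Full"
    if uses_hardware_vss or daily_full:
        return "2 LUNs / Storage Group"
    # Weekly full backups without hardware VSS.
    return "2 LUNs / Backup Set"


def backup_sets(storage_groups: int, per_set: int = 7) -> List[Tuple[int, int]]:
    """Group storage groups into backup sets as (first SG, last SG) ranges."""
    sets = []
    for first in range(1, storage_groups + 1, per_set):
        last = min(first + per_set - 1, storage_groups)
        sets.append((first, last))
    return sets


if __name__ == "__main__":
    print(choose_lun_architecture("Streaming", "Weekly Full with Daily Differential"))
    print(backup_sets(14))  # the 14-database example from the text
```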
LUN Configuration

The LUN Configuration table highlights the number of databases that should be placed on a single LUN. This is derived from the LUN Architecture model.

This section also documents how many LUNs will be required for the entire solution, broken out by Database and Log sets (remember continuous replication will require an additional number of LUNs), and the number of restore LUNs for the source, replica, and SCR targets.

Database Configuration

The Database Configuration table outlines how many databases are required, the number of mailboxes per database, the size of each database, and the transaction log size required for each database.

SG LUN Design

The SG LUN Design table outlines the physical LUN layout and follows the recommended number of storage groups per LUN approach based on the LUN Architecture model. It also documents the LUN size required to support the layout (this is where we factor in the additional capacity for content indexing, the LUN Free Space Percentage, and whether you are using a Restore LUN), as well as the transaction log LUN.
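To give a feel for how content indexing and the LUN Free Space Percentage inflate a database LUN, here is a minimal Python sketch. The 5% content indexing figure comes from the Input section; the way the free space percentage is applied (dividing by the fraction allowed to be used) is an assumption, and the extra capacity needed when no dedicated Restore LUN exists is deliberately left out.

```python
def database_lun_size_gb(database_size_gb: float,
                         databases_per_lun: int = 1,
                         content_indexing: bool = True,
                         free_space_percent: float = 20.0) -> float:
    """Estimate a database LUN size in GB (illustrative only)."""
    data_gb = database_size_gb * databases_per_lun
    if content_indexing:
        # Content indexing requires an additional 5% capacity per database.
        data_gb *= 1.05
    # Assumption: the data may fill at most (100 - free_space_percent)% of the LUN.
    usable_fraction = 1 - free_space_percent / 100
    return data_gb / usable_fraction


if __name__ == "__main__":
    # Hypothetical 200 GB database, one per LUN, 20% free space target.
    print(round(database_lun_size_gb(200), 1))
```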

Backup Requirements

The Backup Requirements section is really a continuation of the Storage Requirements section. It outlines what we believe is the appropriate backup design based on the input factors and the analysis performed in the previous sections.

Streaming Backup Window Requirements

If you selected to utilize a streaming backup methodology, then the Streaming Backup Window Requirements section will provide you with:

  • The Full Backup Window / SG is the amount of time it will take to back up a single storage group utilizing a streaming backup application by taking into account the calculated database size and the backup rate. You should validate this metric against your Service Level Agreements to determine if it is acceptable.
  • The Incremental or Differential backup Window / SG is the amount of time it will take to perform an incremental or differential streaming backup for a single storage group and is based on the number of transaction logs that are generated per day and the backup rate.
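The window estimates are essentially size divided by rate. The Python sketch below shows that arithmetic under two assumptions: 1 GB is treated as 1024 MB, and each Exchange 2007 transaction log is 1 MB. The function names and sample figures are hypothetical.

```python
def full_backup_window_hours(database_size_gb: float, rate_mb_per_s: float) -> float:
    """Hours needed to stream one database at the given rate."""
    size_mb = database_size_gb * 1024
    return size_mb / rate_mb_per_s / 3600


def log_backup_window_hours(logs_to_back_up: int, rate_mb_per_s: float,
                            log_size_mb: float = 1.0) -> float:
    """Hours needed to stream the accumulated transaction logs."""
    return logs_to_back_up * log_size_mb / rate_mb_per_s / 3600


if __name__ == "__main__":
    # Hypothetical: a 200 GB database and 10000 logs at 50 MB/s.
    print(round(full_backup_window_hours(200, 50), 2))
    print(round(log_backup_window_hours(10000, 50), 3))
```

For a differential backup, logs_to_back_up would be every log generated since the last full backup; for an incremental backup, only the logs generated since the previous backup.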
Streaming Restore Window Requirements

If you selected to utilize a streaming backup methodology, then the Streaming Restore Window Requirements section will provide you with:

  • The Full Restore Window / SG is the amount of time it will take to restore a single storage group utilizing a streaming restore process by taking into account the calculated database size and the restore rate. You should validate this metric against your Service Level Agreements to determine if it is acceptable.
  • The Incremental or Differential Restore Window / SG is the amount of time it will take to perform an incremental or differential streaming restore for a single storage group and is based on the number of transaction logs that are generated per day and the restore rate.
Backup Configuration

The Backup Configuration table outlines the number of databases that will be placed within a single LUN and the type of backup methodology and frequency in which the backups will occur.

Backup Frequency Configuration

The Backup Frequency Configuration section will provide you with an outline on how you should perform the backups for each server, utilizing either a daily full backup or weekly full backup frequency.

Log Replication Requirements

The Log Replication Requirements section is another continuation of the Storage Requirements section. It outlines what we believe is the throughput required to replicate the transaction logs for SCR targets or a geographically dispersed CCR scenario. Please note that if you selected to have multiple mailbox servers, then the data outputted in this section represents all mailbox servers.

 

Log Replication Throughput Requirements

The Log Replication Throughput Requirements table will provide you with

  • The Transaction Logs Generated / Day is the number of logs that will be generated for the entire day (includes user generated logs and mailbox move generated logs).
  • The Geographically Dispersed CCR Throughput Required / CMS is the throughput required to sustain a single geographically dispersed cluster's log generation.  This value is based on the peak log generation hour.
  • The Geographically Dispersed CCR Throughput Required is the throughput required to sustain all geographically dispersed clusters' log generation. This value is based on the peak log generation hour.
  • The SCR Throughput Required SCR Target / Source is the throughput required to sustain log replication to a single SCR target from a single source mailbox server.  This value is based on the recovery point objective.  This model does not assume that the peak hours are contiguous.  The effect is that you can place peak hours at 8am and 4pm, for example, and the resulting bandwidth requirement will assume that you can use the time between 8am and 4pm to catch up within the specified RPO.
  • The Total SCR Throughput Required is the total throughput required to sustain log replication to all SCR targets from all source mailbox servers.
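For the peak-hour case used by geographically dispersed CCR, the raw arithmetic can be sketched as follows. This Python example assumes 1 MB transaction logs (the Exchange 2007 log file size) and ignores protocol overhead, so treat the result as a lower bound rather than the calculator's figure. It does not model the RPO-based SCR calculation.

```python
from typing import Sequence

LOG_FILE_BYTES = 1024 * 1024  # Exchange 2007 transaction logs are 1 MB each


def peak_hour_throughput_mbps(logs_per_hour: Sequence[int]) -> float:
    """Megabits per second needed to ship the busiest hour's logs within that hour."""
    peak_logs = max(logs_per_hour)
    bits_per_second = peak_logs * LOG_FILE_BYTES * 8 / 3600
    return bits_per_second / 1_000_000


if __name__ == "__main__":
    hourly = [0] * 24
    hourly[9] = 3600  # hypothetical peak: one log per second between 9am and 10am
    print(round(peak_hour_throughput_mbps(hourly), 2))
```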

Chosen Network Link Suitability

The Chosen Network Link Suitability table will dictate whether the chosen network link has sufficient capacity to sustain geographically dispersed CCR replication and/or SCR replication.   If the network link cannot sustain the log replication traffic, then you will need to either upgrade the network link to the recommended network link throughput, or adjust the design appropriately.

Recommended Network Link

The Recommended Network Link table recommends an appropriate network link if the chosen network link does not have sufficient capacity to sustain log replication for geographically dispersed CCR and SCR solutions. 

TCP/IP Settings for Geographically Dispersed CCR

The TCP/IP Settings for Geographically Dispersed CCR table outlines the custom TCPWindowSize and TCP1323Opts values you should deploy on the source and target server (assuming both source and target are Windows Server 2003) to improve the number of logs that can be replicated per second.  This value is determined based on the network link (either the chosen network link if it is acceptable, or the recommended network link) and its latency.

TCP/IP Settings for SCR

The TCP/IP Settings for SCR table outlines the custom TCPWindowSize and TCP1323Opts values you should deploy on the source and target server (assuming both source and target are Windows Server 2003) to improve the number of logs that can be replicated per second.  These values are determined based on the network link (either the chosen network link if it is acceptable, or the recommended network link) and its latency.  Please note that in the SCR target replication scenario, the recommendation assumes that all SCR targets will replicate over the same network link.
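
Because these values depend on the link speed and its latency, a common way to reason about them is the bandwidth-delay product: the amount of data that must be in flight to keep the link full. The sketch below shows that calculation under the assumption that this is a reasonable approximation of how a window size is chosen; it is not the calculator's exact formula, and the function name is hypothetical.

```python
MAX_UNSCALED_WINDOW = 65535  # largest TCP window without window scaling (RFC 1323)


def tcp_window_estimate(link_mbps: float, round_trip_ms: float):
    """Return (window_bytes, needs_window_scaling) from the
    bandwidth-delay product of the link."""
    bytes_per_second = link_mbps * 1_000_000 / 8
    window_bytes = round(bytes_per_second * round_trip_ms / 1000)
    # Windows larger than 65,535 bytes require TCP window scaling, which
    # is what the TCP1323Opts setting controls on Windows Server 2003.
    return window_bytes, window_bytes > MAX_UNSCALED_WINDOW


window, scaling = tcp_window_estimate(link_mbps=10, round_trip_ms=100)
print(window, scaling)
```

Use the values reported by the calculator for an actual deployment; the estimate above only shows why higher latency or a faster link leads to a larger recommended window.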

Note: the Network Link recommendations do not take into account database seeding or any other data that may also utilize the link.

Storage Design

The Storage Design worksheet is designed to take the data collected from the Input worksheet and Storage Requirements worksheet and help you determine the number of physical disks needed to support the databases, transaction logs, and Restore LUN configurations.

Storage Design Input Factors

In order to determine the physical disk requirements, you must enter in some basic information about your storage solution.

Step 1 - RAID Configuration

RAID Parity Configuration

For the RAID Parity Configuration table you need to select the type of building block your storage solution utilizes.  For example, some storage vendors build the underlying storage in sets of data+parity (d+p) groups.  A RAID-5 3+1 configuration means that each group of 4 disks provides 3 disks' worth of capacity and 1 disk's worth of parity, even though parity is distributed across all the disks.  So if you had a capacity requirement that would utilize 15 disks, then you would need to deploy five 3+1 groups (20 physical disks in total) to build that RAID-5 array.

  1. RAID-1/0 supports 1d+1p, 2d+2p, and 4d+4p groupings.
  2. RAID-5 supports 3d+1p through 20d+1p groupings (though storage solutions could support more than that).
  3. RAID-6 supports 6d+2p groupings.
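
The d+p grouping arithmetic can be sketched as follows. This is a minimal illustration that assumes the capacity requirement is expressed as a number of data disks and that only whole groups can be deployed; the function name is hypothetical.

```python
def raid_groups(data_disks_needed: int, d: int, p: int):
    """Return (groups, physical_disks) for a d+p building block.

    Only whole groups can be deployed, so the number of groups is
    rounded up to cover all required data disks.
    """
    groups = -(-data_disks_needed // d)  # ceiling division
    return groups, groups * (d + p)


# The RAID-5 3+1 example from the text: 15 data disks.
print(raid_groups(15, d=3, p=1))
```

Note that a requirement that does not divide evenly (for example, 16 data disks on 3+1 groups) is rounded up to the next whole group.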

RAID Rebuild Overhead

When a disk is lost, the disk needs to be replaced and rebuilt.  During this time, the performance of the RAID group is affected, which in turn can affect user actions.  Therefore, to ensure that RAID rebuilds do not affect the overall performance of the mailbox server, Microsoft recommends provisioning sufficient overhead in the performance calculations when designing for RAID parity.  Most RAID-1/0 implementations will suffer a 25% performance penalty during a rebuild.  Most RAID-5 and RAID-6 implementations will suffer a 50% performance penalty during a rebuild.

The calculator defaults to the following Microsoft recommendations, which are adjustable:

  • For RAID-1/0 implementations, ensure that you factor in an additional 35% performance overhead.
  • For RAID-5/RAID-6 implementations, ensure that you factor in an additional 100% performance overhead.
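
One way to see how a rebuild penalty relates to an overhead percentage is sketched below. The relationship used here (the array must still deliver the required I/O while running at reduced performance) is an assumption made for illustration, not a formula taken from the calculator, and the function names are hypothetical.

```python
def overhead_from_penalty(rebuild_penalty: float) -> float:
    """Extra performance (as a fraction) needed so that an array running
    at (1 - rebuild_penalty) of its normal performance still meets the
    original requirement."""
    return 1 / (1 - rebuild_penalty) - 1


def provisioned_iops(required_iops: float, overhead: float) -> float:
    """I/O capability to design for, given an overhead fraction."""
    return required_iops * (1 + overhead)


print(overhead_from_penalty(0.50))  # RAID-5/RAID-6: 50% penalty
print(overhead_from_penalty(0.25))  # RAID-1/0: 25% penalty
print(provisioned_iops(1000, 1.00))
```

Under this assumption a 50% penalty corresponds to 100% overhead, and a 25% penalty corresponds to roughly 33%, which is close to the 35% default listed above.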

RAID Configuration

By default the calculator will recommend either RAID-1/0 or RAID-5 by evaluating capacity and I/O factors and determining which configuration utilizes the fewest disks while satisfying the requirements.  If you would like to override this and force the calculator to utilize a particular RAID configuration (e.g., RAID-0 or RAID-6), select "Yes" to this option and then select the appropriate RAID configuration in the cell labeled "Desired RAID Configuration."
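
The selection logic described above can be approximated with a simplified model. The sketch below is an assumption-laden illustration, not the calculator's implementation: it uses the common rule-of-thumb write penalties (2 back-end I/Os per write for RAID-1/0, 4 for RAID-5), a hypothetical RAID-5 3+1 building block, and it ignores rebuild overhead and other factors the calculator considers. All names and sample numbers are hypothetical.

```python
import math

# Back-end I/Os generated per host write (common rule of thumb).
WRITE_PENALTY = {"RAID-1/0": 2, "RAID-5": 4}


def disks_needed(raid, capacity_gb, read_iops, write_iops,
                 disk_gb, disk_iops, raid5_data_disks=3):
    """Disks required to satisfy both capacity and I/O for one RAID type."""
    backend_iops = read_iops + write_iops * WRITE_PENALTY[raid]
    io_disks = math.ceil(backend_iops / disk_iops)
    data_disks = math.ceil(capacity_gb / disk_gb)
    if raid == "RAID-1/0":
        # Every data disk is mirrored; keep the disk count even.
        return max(2 * data_disks, io_disks + io_disks % 2)
    # RAID-5: deploy whole d+1 groups.
    group_size = raid5_data_disks + 1
    capacity_groups = math.ceil(data_disks / raid5_data_disks)
    io_groups = math.ceil(io_disks / group_size)
    return max(capacity_groups, io_groups) * group_size


def recommend_raid(capacity_gb, read_iops, write_iops, disk_gb, disk_iops):
    """Pick the RAID type that meets both requirements with the fewest disks."""
    options = {
        raid: disks_needed(raid, capacity_gb, read_iops, write_iops,
                           disk_gb, disk_iops)
        for raid in WRITE_PENALTY
    }
    best = min(options, key=options.get)
    return best, options[best]


# Capacity-bound workload versus I/O-bound workload (hypothetical numbers).
print(recommend_raid(3000, 300, 200, disk_gb=300, disk_iops=150))
print(recommend_raid(600, 1000, 1000, disk_gb=300, disk_iops=150))
```

In this model, capacity-bound designs tend to favor RAID-5 and write-heavy, I/O-bound designs tend to favor RAID-1/0, which matches the general intent of comparing both configurations.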

By default the calculator utilizes RAID-5 for the Restore LUN.  However, you can define a specific RAID configuration for the Restore LUN.

Step 2 - Disk Selection

In this section you can select the appropriate disk capacity and disk type that you will want to utilize for your databases, transaction logs, and Restore LUN disks. 

The storage calculator allows you to select up to three different disk configuration scenarios, which allows you to perform comparisons.  The calculator will then run through the possible iterations and choose an appropriate configuration that ensures that both capacity and performance metrics are met while utilizing the fewest physical disks.  However, please keep in mind that the calculator does not take into account other factors that should be considered when evaluating different storage solutions, such as cost per disk, power consumption per disk, additional hardware (e.g., storage controller, disk enclosures) and software costs, and operational management costs.

Storage Design Calculations

The Calculations Pane performs all the calculations based on the input factors and outputs the key calculations into the Results Pane.  For this blog, I will not delve into the specifics of the calculations, but feel free to review them within the calculator.

Storage Design Results

The Storage Design Results section outputs the recommended configuration for the solution.  The recommendations made are the following:

  • Optimum RAID configuration
  • Optimum number of disks for databases, transaction logs, and Restore LUN

RAID Configurations

The RAID Configurations table outlines the number of disks required and the RAID configuration that should be used for each disk configuration that you previously selected in the Storage Design Input Factors section.

Recommended RAID Configuration / Server

This table recommends the optimum configuration for each mailbox server, ensuring that performance and capacity requirements are met in the design.  If multiple disk type and capacity configurations were originally selected, then each configuration will be compared, and the disk / RAID option that utilizes the fewest disks (while ensuring the performance and capacity requirements are met) will be recommended.

Storage Configuration

This table will output the total number of disks required for each mailbox server (for both source and replica instances in the LCR/CCR scenario) and its respective SCR targets.  It will also identify the total number of disks required to support the entire environment.
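
The totals in this table amount to multiplying the per-copy disk count by the number of copies. The sketch below illustrates that arithmetic under the simplifying assumption that every copy (source, LCR/CCR replica, and each SCR target) uses the same number of disks; the function names and sample values are hypothetical.

```python
def disks_per_server(disks_per_copy: int, has_replica: bool, scr_targets: int) -> int:
    """Disks for one mailbox server: the source copy, an optional
    LCR/CCR replica copy, and one copy per SCR target."""
    copies = 1 + (1 if has_replica else 0) + scr_targets
    return disks_per_copy * copies


def environment_disks(servers) -> int:
    """Total disks across all servers; each entry is a tuple of
    (disks_per_copy, has_replica, scr_targets)."""
    return sum(disks_per_server(*server) for server in servers)


# Four identical CCR servers, each with one SCR target and 16 disks per copy.
print(environment_disks([(16, True, 1)] * 4))
```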

Conclusion

Hopefully you will find this calculator invaluable in helping to determine your storage requirements for Exchange 2007 mailbox servers. If you have any questions or suggestions, please email strgcalc AT microsoft DOT com.

Important

Changes made to the calculator in the v11.x time frame have resulted in the calculator no longer being backwards compatible with older versions of Excel.  If you do not have Excel 2007 and would like to use the calculator, you can get the trial here.

For the calculator itself, please see the following link:

Exchange 2007 Mailbox Server Role Storage Requirements Calculator spreadsheet

- Ross Smith IV