EDIT: This post has been updated on 5/6/10 for the new version of storage calculator. For the list of latest major changes, please see THIS.
To assist customers in designing their storage layout for Exchange 2007, we have put together a calculator that determines the storage requirements (I/O performance and capacity) and the optimal LUN layout based on a set of input factors.
The calculator uses all the recommendations outlined in the following articles, and thus we recommend you read them before utilizing the calculator:
The calculator is broken out into the following sections (worksheets):
Important: The data points provided in the calculator are an example configuration. As such, any data points entered into the Input worksheet are specific to that particular configuration and do not apply to other configurations. Please ensure you are using the correct data points for your design.
Important: Changes made to the calculator in the v11.x time frame have resulted in the calculator no longer being backwards compatible with older versions of Excel. If you do not have Excel 2007 and would like to use the calculator, please download VirtualPC 2007 (http://www.microsoft.com/downloads/details.aspx?FamilyID=04d26402-3199-48a3-afa2-2dc0b40a73b6&DisplayLang=en) and the Office 2007 VHD (http://www.microsoft.com/downloads/details.aspx?FamilyID=f9956176-cf66-478b-b20d-b9b92dd0dbfa&DisplayLang=en).
This section is where you enter all the relevant information regarding your design, so that the calculator can generate the requirements you need to meet in order to achieve it.
Note: There are many input factors that need to be accounted for before you can design your solution. Each input factor is briefly listed below; there are additional notes within the calculator that explain them in more detail.
The calculator provides the capability to design a storage solution that can support three different tiers (or classes) of mailbox users.
The data for this section will help determine the appropriate log bandwidth requirements for both geographically dispersed CCR and SCR configurations.
Now you may be wondering how you can collect this data. We've written a simple VBS script that will enumerate all files in a folder and output the list to a log file. You can use Task Scheduler to execute this script at certain intervals in the day (e.g., every 15 minutes). Once you have generated the log file for a 24-hour period, you can import it into Excel, massage the data (i.e., remove duplicate entries), and determine how many logs are generated for each hour. If you do this for each storage group, you will be able to determine your log generation rate for each hour in the day. This script is named collectlogs.vbsrename (just rename it to collectlogs.vbs) and you can find it here:
Collectlogs VBS script
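If you would rather not do the de-duplication and hourly counting in Excel, the same steps can be sketched in a few lines of Python. This is only an illustration: the output format of collectlogs.vbs is not shown in this post, so the sketch assumes you have already parsed the log into pairs of file name and the time the file was observed. The function name and the sample file names are hypothetical.

```python
from collections import Counter
from datetime import datetime


def logs_per_hour(observations):
    """Count distinct log files by the hour in which each was first observed.

    observations: iterable of (file_name, seen_at) pairs, where seen_at is a
    datetime. The same file may show up in many polls; only its earliest
    sighting is counted, which mirrors removing duplicate entries in Excel.
    """
    first_seen = {}
    for name, seen_at in observations:
        if name not in first_seen or seen_at < first_seen[name]:
            first_seen[name] = seen_at
    counts = Counter(ts.hour for ts in first_seen.values())
    # One entry per hour of the day, 0 through 23.
    return [counts.get(hour, 0) for hour in range(24)]


# Hypothetical sample data from polls taken every 15 minutes.
sample = [
    ("E0000001.log", datetime(2010, 5, 6, 9, 0)),
    ("E0000001.log", datetime(2010, 5, 6, 9, 15)),  # duplicate from a later poll
    ("E0000002.log", datetime(2010, 5, 6, 9, 15)),
    ("E0000003.log", datetime(2010, 5, 6, 10, 0)),
]
hourly = logs_per_hour(sample)
print(hourly[9], hourly[10])
```

Run this once per storage group to get the hourly log generation rate that the calculator asks for.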
What type of network link will you be using between the servers? Select the appropriate network link you will be using between the two nodes in the geographically dispersed cluster or between the SCR source and SCR targets.
What is the latency on the network link? Enter the latency (in milliseconds) that exists on the network link.
How can you survive a network outage? When a network outage occurs, log replication cannot occur. As a result, the copy queue length will increase on the source; in addition, log truncation cannot occur on the source. For geographically dispersed CCR or remote SCR deployments, network outages can seriously affect the solution's usefulness. If the outage is too long, log capacity on the source may become compromised and, as a result, a manual log truncation event must occur. Once that happens, the remote copies must be reseeded. The Network Failure Tolerance parameter ensures there is enough capacity on the log LUNs to survive an extended network outage.
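To get a feel for how much log capacity an outage consumes, here is a rough back-of-the-envelope sketch. It is not the calculator's formula. It assumes the worst case, in which logs are generated at the peak hourly rate for the whole outage, and it relies on the fact that Exchange 2007 transaction log files are 1 MB each. The function and parameter names are hypothetical.

```python
def outage_log_capacity_gb(peak_logs_per_hour, outage_hours, log_file_mb=1):
    """Log space (in GB) that piles up on the source while truncation is blocked.

    Conservative estimate: assumes the peak hourly log generation rate is
    sustained for the entire outage.
    """
    return peak_logs_per_hour * outage_hours * log_file_mb / 1024


# Illustrative numbers only: 2,048 logs per hour during a 12 hour outage.
needed_gb = outage_log_capacity_gb(peak_logs_per_hour=2048, outage_hours=12)
print(needed_gb)
```

With these sample inputs the log LUN needs an extra 24 GB of headroom for that storage group set.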
This section deals with outputting the I/O performance and capacity storage requirements based on the input factors entered into the calculator.
The Calculations Pane performs all the calculations based on the input factors and outputs the key calculations into the Results Pane. For this blog, I will not delve into the specifics of the calculations, but feel free to review them within the calculator.
Based on the above input factors the calculator will recommend the following settings.
The Number of Servers and Data Copies table will provide you with
The Mailbox Configuration table will provide you with
The Solution Configuration table will provide you with
The Transaction Log Requirements table will provide you with
The Disk Space & Performance Requirements table will provide you with
LUN Requirements
The LUN Requirements section is really a continuation of the Storage Requirements section. It outlines what we believe is the appropriate LUN design based on the input factors and the analysis performed in the previous section.
Note: The term LUN utilized in the calculator refers only to the representation of the disk that is exposed to the host operating system. It does not define the disk configuration.
The LUN Design highlights the LUN architecture chosen for this server solution. The architecture is derived from the backup type and frequency that was chosen in the Storage Requirements section.
The LUN Configuration table highlights the number of databases that should be placed on a single LUN. This is derived from the LUN Architecture model.
This section also documents how many LUNs will be required for the entire solution, broken out by Database and Log sets (remember continuous replication will require an additional number of LUNs), and the number of restore LUNs for the source, replica, and SCR targets.
The Database Configuration table outlines how many databases are required, the number of mailboxes per database, the size of each database, and the transaction log size required for each database.
The SG LUN Design table outlines the physical LUN layout and follows the recommended number of storage groups per LUN approach based on the LUN Architecture model. It also documents the LUN size required to support the layout (this is where we factor in the additional capacity for content indexing, the LUN Free Space Percentage, and whether you are using a Restore LUN), as well as the transaction log LUN.
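As a rough illustration of how those factors combine, the sketch below sizes a database LUN from the database size, a content indexing overhead, and a free space percentage. The formula is an assumption about how the factors might be combined, not a copy of the calculator's logic. It ignores the Restore LUN, and the percentages in the example are placeholders rather than the calculator's defaults.

```python
def database_lun_size_gb(db_size_gb, dbs_per_lun, ci_overhead, free_space):
    """Estimate the database LUN size in GB.

    ci_overhead: content indexing space as a fraction of database size.
    free_space:  fraction of the LUN that must remain free.
    """
    used_gb = db_size_gb * dbs_per_lun * (1 + ci_overhead)
    # Grow the LUN so that used_gb fills only (1 - free_space) of it.
    return used_gb / (1 - free_space)


# Illustrative inputs: two 100 GB databases, 5% indexing overhead, 20% free.
lun_gb = database_lun_size_gb(100, 2, ci_overhead=0.05, free_space=0.20)
print(round(lun_gb, 1))
```

Note that dividing by (1 - free_space) is not the same as adding 20% on top: the first guarantees that 20% of the finished LUN is free, while the second leaves slightly less.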
The Backup Requirements section is really a continuation of the Storage Requirements section. It outlines what we believe is the appropriate backup design based on the input factors and the analysis performed in the previous sections.
If you chose a streaming backup methodology, then the Streaming Backup Window Requirements section will provide you with:
If you chose a streaming backup methodology, then the Streaming Restore Window Requirements section will provide you with:
The Backup Configuration table outlines the number of databases that will be placed within a single LUN, the backup methodology, and the frequency with which the backups will occur.
The Backup Frequency Configuration section will provide you with an outline on how you should perform the backups for each server, utilizing either a daily full backup or weekly full backup frequency.
The Log Replication Requirements section is another continuation of the Storage Requirements section. It outlines what we believe is the throughput required to replicate the transaction logs for SCR targets or a geographically dispersed CCR scenario. Please note that if you chose to deploy multiple mailbox servers, then the data output in this section represents all mailbox servers.
Log Replication Throughput Requirements
The Log Replication Throughput Requirements table will provide you with
Chosen Network Link Suitability
The Chosen Network Link Suitability table will dictate whether the chosen network link has sufficient capacity to sustain geographically dispersed CCR replication and/or SCR replication. If the network link cannot sustain the log replication traffic, then you will need to either upgrade the network link to the recommended network link throughput, or adjust the design appropriately.
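The basic idea behind a suitability check can be sketched as follows. This is a simplified illustration and not the calculator's method: it converts the peak hourly log generation rate into megabits per second (treating each 1 MB Exchange 2007 log file as 8 megabits) and compares that to the link speed. It does not model latency, protocol overhead, or other traffic on the link. The function names and the `usable_fraction` parameter are hypothetical.

```python
def required_mbps(peak_logs_per_hour, log_file_mb=1.0):
    """Throughput needed to ship logs as fast as they are generated."""
    megabits_per_hour = peak_logs_per_hour * log_file_mb * 8
    return megabits_per_hour / 3600


def link_is_sufficient(peak_logs_per_hour, link_mbps, usable_fraction=1.0):
    """True if the link can carry the peak log replication traffic.

    usable_fraction lets you reserve part of the link for other traffic.
    """
    return required_mbps(peak_logs_per_hour) <= link_mbps * usable_fraction


# Illustrative numbers: 900 logs per hour at peak across all servers.
print(required_mbps(900))
print(link_is_sufficient(900, link_mbps=1.544))  # T1
print(link_is_sufficient(900, link_mbps=10))
```

In this example the peak load needs 2 Mbps, so a T1 falls short while a 10 Mbps link has room to spare.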
Recommended Network Link
The Recommended Network Link table recommends an appropriate network link if the chosen network link does not have sufficient capacity to sustain log replication for geographically dispersed CCR and SCR solutions.
TCP/IP Settings for Geographically Dispersed CCR
The TCP/IP Settings for Geographically Dispersed CCR table outlines the custom TCPWindowSize and TCP1323Opts values you should deploy on the source and target servers (assuming both source and target are Windows Server 2003) to improve the number of logs that can be replicated per second. These values are determined based on the network link (either the chosen network link if it is acceptable, or the recommended network link) and its latency.
TCP/IP Settings for SCR
The TCP/IP Settings for SCR table outlines the custom TCPWindowSize and TCP1323Opts values you should deploy on the source and target servers (assuming both source and target are Windows Server 2003) to improve the number of logs that can be replicated per second. These values are determined based on the network link (either the chosen network link if it is acceptable, or the recommended network link) and its latency. Please note that in the SCR target replication scenario, the recommendation assumes that all SCR targets will replicate over the same network link.
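Why do link speed and latency drive the window size? A common rule of thumb is the bandwidth-delay product: the amount of data that must be in flight to keep the link full. The sketch below shows that general rule, not the calculator's exact logic, and it assumes the latency you enter is the round-trip time. A window larger than 65,535 bytes requires TCP window scaling (RFC 1323), which is what the TCP1323Opts setting governs. The function names are hypothetical.

```python
def bandwidth_delay_product_bytes(link_mbps, round_trip_ms):
    """Bytes that must be in flight to keep the link fully utilized."""
    # Mbps * 1,000,000 bits * (ms / 1000) seconds, divided by 8 bits per byte.
    return round(link_mbps * 1000 * round_trip_ms / 8)


def needs_window_scaling(window_bytes):
    """Windows above 65,535 bytes do not fit in the 16-bit TCP window field."""
    return window_bytes > 65535


# Illustrative link: 45 Mbps with 50 ms of round-trip latency.
window = bandwidth_delay_product_bytes(link_mbps=45, round_trip_ms=50)
print(window, needs_window_scaling(window))
```

For this sample link the product is 281,250 bytes, well above what an unscaled TCP window can advertise.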
Note: the Network Link recommendations do not take into account database seeding or any other data that may also utilize the link.
Storage Design
The Storage Design worksheet is designed to take the data collected from the Input worksheet and Storage Requirements worksheet and help you determine the number of physical disks needed to support the databases, transaction logs, and Restore LUN configurations.
Storage Design Input Factors
In order to determine the physical disk requirements, you must enter in some basic information about your storage solution.
Step 1 - RAID Configuration
RAID Parity Configuration
For the RAID Parity Configuration table you need to select the type of building block your storage solution utilizes. For example, some storage vendors build the underlying storage in sets of data+parity (d+p) groups. A RAID-5 3+1 configuration means that 3 disks will be used for capacity and 1 disk will be used for parity, even though parity is distributed across all the disks. So if you had a capacity requirement that would utilize 15 disks, then you would need to deploy five 3+1 groups to build that RAID-5 array.
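The arithmetic in that example can be written out as a short sketch. It reads the 15 disks as 15 disks' worth of data capacity, so the parity disks come on top of that. The function name is hypothetical.

```python
import math


def raid_groups(data_disks_needed, data_per_group, parity_per_group):
    """Return (number of d+p groups, total physical disks)."""
    groups = math.ceil(data_disks_needed / data_per_group)
    physical_disks = groups * (data_per_group + parity_per_group)
    return groups, physical_disks


# The example from the text: 15 data disks in RAID-5 3+1 building blocks.
print(raid_groups(15, data_per_group=3, parity_per_group=1))
```

Because groups are deployed whole, a requirement of 16 data disks would round up to six groups and 24 physical disks.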
RAID Rebuild Overhead
When a disk is lost, the disk needs to be replaced and rebuilt. During this time, the performance of the RAID group is affected, which in turn can affect user actions. Therefore, to ensure that RAID rebuilds do not affect the overall performance of the mailbox server, Microsoft recommends that you provision sufficient overhead into the performance calculations when designing for RAID parity. Most RAID-1/0 implementations will suffer a 25% performance penalty during a rebuild. Most RAID-5 and RAID-6 implementations will suffer a 50% performance penalty during a rebuild.
The calculator defaults to the following Microsoft recommendations, but they are adjustable:
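One way to read "sufficient overhead" is that the array should still deliver the required I/O while it is running at reduced performance. The sketch below expresses that interpretation. It is an assumption about how the penalty might be applied, not the calculator's formula, and the function name is hypothetical.

```python
def iops_to_provision(required_iops, rebuild_penalty):
    """IOPS the array must deliver when healthy so that it still meets
    required_iops while degraded by rebuild_penalty (a fraction, e.g. 0.25)."""
    return required_iops / (1 - rebuild_penalty)


# Illustrative requirement of 3,000 IOPS with the penalties quoted above.
print(iops_to_provision(3000, 0.25))  # RAID-1/0
print(iops_to_provision(3000, 0.50))  # RAID-5 or RAID-6
```

With these numbers a RAID-1/0 design would be sized for 4,000 IOPS and a RAID-5 design for 6,000 IOPS.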
RAID Configuration
By default, the calculator will recommend either RAID-1/0 or RAID-5 by evaluating capacity and I/O factors and determining which configuration utilizes the fewest disks while satisfying the requirements. If you would like to override this and force the calculator to utilize a particular RAID configuration (e.g., RAID-0 or RAID-6), select "Yes" for this option and then select the appropriate RAID configuration in the cell labeled "Desired RAID Configuration."
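To make the trade-off concrete, here is a simplified sketch of a "fewest disks that satisfy both capacity and I/O" comparison. It is not the calculator's implementation. It assumes the commonly cited write penalties of 2 for RAID-1/0 and 4 for RAID-5, uses 3+1 groups for RAID-5 capacity, and ignores rebuild overhead and LUN free space. All names and sample inputs are hypothetical.

```python
import math

# Commonly cited back-end writes per host write; treat these as assumptions.
WRITE_PENALTY = {"RAID-1/0": 2, "RAID-5": 4}


def disks_for_io(read_iops, write_iops, raid, iops_per_disk):
    """Disks needed to absorb the back-end I/O generated by the host I/O."""
    backend_iops = read_iops + write_iops * WRITE_PENALTY[raid]
    return math.ceil(backend_iops / iops_per_disk)


def disks_for_capacity(required_gb, raid, disk_gb, data_per_group=3):
    """Disks needed to hold the data, including mirror or parity disks."""
    data_disks = math.ceil(required_gb / disk_gb)
    if raid == "RAID-1/0":
        return data_disks * 2  # every data disk has a mirror
    groups = math.ceil(data_disks / data_per_group)
    return groups * (data_per_group + 1)  # one parity disk per group


def recommend_raid(read_iops, write_iops, required_gb, disk_gb, iops_per_disk):
    """Return (best_raid, disks_per_option), preferring the fewest disks."""
    options = {
        raid: max(
            disks_for_io(read_iops, write_iops, raid, iops_per_disk),
            disks_for_capacity(required_gb, raid, disk_gb),
        )
        for raid in WRITE_PENALTY
    }
    return min(options, key=options.get), options


# Illustrative workloads on 300 GB disks rated at 150 IOPS each.
capacity_bound = recommend_raid(300, 300, 4000, 300, 150)
io_bound = recommend_raid(1000, 1000, 1000, 300, 150)
print(capacity_bound)
print(io_bound)
```

In the first sample workload capacity dominates and RAID-5 wins with 20 disks versus 28. In the second, the heavier write load makes RAID-1/0 the cheaper choice at 20 disks versus 34.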
By default the calculator utilizes RAID-5 for the Restore LUN. However, you can define a specific RAID configuration for the Restore LUN.
Step 2 - Disk Selection
In this section you can select the appropriate disk capacity and disk type that you will want to utilize for your databases, transaction logs, and Restore LUN disks.
The storage calculator lets you select up to three different disk configuration scenarios so that you can compare them. The calculator will then run through the possible iterations and choose an appropriate configuration that ensures that both capacity and performance metrics are met while utilizing the fewest physical disks. However, please keep in mind that the calculator does not take into account other factors that should be considered when evaluating different storage solutions, such as cost per disk, power consumption per disk, additional hardware (e.g., storage controller, disk enclosures) and software costs, and operational management costs.
Storage Design Calculations
Storage Design Results
The Storage Design Results section outputs the recommended configuration for the solution. The recommendations made are the following:
RAID Configurations
The RAID Configurations table outlines the number of disks required and the RAID configuration that should be used for each disk configuration that you previously selected in the Input section.
Recommended RAID Configuration / Server
This table recommends the optimum configuration for each mailbox server, ensuring that performance and capacity requirements are met in the design. If multiple disk type and capacity configurations were originally selected, then each configuration will be compared and the disk / RAID option that utilizes the fewest disks (while ensuring the performance and capacity requirements are met) will be recommended.
Storage Configuration
This table will output the total number of disks required for each mailbox server (for both source and replica instances in the LCR/CCR scenario) and its respective SCR targets. It will also identify the total number of disks required to support the entire environment.
Hopefully you will find this calculator invaluable in helping to determine your storage requirements for Exchange 2007 mailbox servers. If you have any questions or suggestions, please email strgcalc AT microsoft DOT com.
Important
Changes made to the calculator in the v11.x time frame have resulted in the calculator no longer being backwards compatible with older versions of Excel. If you do not have Excel 2007 and would like to use the calculator, you can get the trial here.
For the calculator itself, please see the following link:
Exchange 2007 Mailbox Server Role Storage Requirements Calculator spreadsheet
- Ross Smith IV