Advantages of DAS over SAN storage in Exchange 2007

Published 17 April 08 10:16 AM

You may have heard me talk about this at TechEd, and some of my customers will recall long discussions with me about the merits of using Direct Attached Storage rather than SANs in Exchange 2007.

Storage Area Networks (SAN) have become increasingly common in Enterprise environments in recent years, and in particular for hosting Exchange Server environments due to their space and IO requirements.

There are many advantages to using a SAN with Exchange:

  • Performance - Exchange 2003 is a very IO intensive application, and in any decent-sized environment disk IO is almost always the bottleneck
  • Scalability - we're storing more data than ever before, and although technologies like SharePoint relieve some of the burden from Exchange, the fact is that mailbox sizes continue to grow as we move into the age of UM and UC.
  • Availability - consolidating your storage provides a single system to maintain and backup, and allows you to build HA/DR into the storage system to protect all of your data.

However, Exchange 2007 introduced some new concepts, which have changed things somewhat.

  • Performance - by moving to a 64-bit platform (which does not have the virtual memory limitations of its 32-bit predecessors) we have significantly reduced the IO impact in Exchange, by about 70% on a mailbox server!  Modern DAS technologies such as Serial Attached SCSI (SAS) are capable of providing more than enough IO for Exchange 2007.
  • Scalability - SANs are expensive, so as storage requirements increase so does the TCO of the environment.  For example, at Microsoft the high costs associated with shared SAN storage hindered MSIT from supporting employees with mailbox quotas of greater than 200 MB.
  • Availability - SANs don't fail often, but they do fail - and when they do it can be catastrophic.  Three of my Exchange 2003 customers have had SAN failures this year, and in each case it took several days to return Exchange to full operation.  Two of these customers were using SAN replication technology which actually replicated corrupt data to the DR site, taking it out as well and making the restore process much more complex (and defeating the whole purpose of the replication IMHO).  Exchange 2007 introduced CCR and SCR, which provide software-based replication for HA and DR, eliminating single points of failure and simplifying site resilience.
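
To put the performance point in rough numbers, here is a minimal sizing sketch. It applies the "about 70%" IO reduction quoted above to a hypothetical Mailbox server. The mailbox count, the per-mailbox IOPS figure and the per-disk IOPS figure are placeholders picked for illustration, not measured Exchange or SAS numbers, and the RAID write penalty is ignored.

```python
import math

def required_disks(mailboxes, iops_per_mailbox, reduction_pct, iops_per_disk):
    """Disks needed to serve the aggregate IO load (RAID penalty ignored)."""
    total_iops = mailboxes * iops_per_mailbox * (100 - reduction_pct) / 100
    return math.ceil(total_iops / iops_per_disk)

# Hypothetical inputs, chosen only to illustrate the arithmetic.
MAILBOXES = 4000
IOPS_PER_MAILBOX_2003 = 1.0   # assumed baseline on the 32-bit platform
IO_REDUCTION_PCT = 70         # "about 70%" reduction quoted in the post
IOPS_PER_DISK = 150           # assumed capability of a single disk

disks_2003 = required_disks(MAILBOXES, IOPS_PER_MAILBOX_2003, 0, IOPS_PER_DISK)
disks_2007 = required_disks(MAILBOXES, IOPS_PER_MAILBOX_2003, IO_REDUCTION_PCT, IOPS_PER_DISK)

print(disks_2003, disks_2007)  # 27 8
```

With these made-up inputs the spindle count needed for IO alone drops from 27 to 8, which is the kind of load a shelf of directly attached disks can plausibly absorb.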

 

Figure 1: a typical Exchange 2003 2-node cluster using SAN replication for DR


Figure 2: Exchange 2007 - CCR+SCR, using DAS storage

The two diagrams above, scanned from some scribblings I drew on paper in a recent Architectural Design Session with a customer (apologies for the messiness - I'm so used to keyboard/mouse that my handwriting skills have declined :)), show their existing Exchange 2003 cluster and a proposed Exchange 2007 cluster.

The advantages of the Exchange 2007 environment here are:

  • Increased availability in Prod datacentre - there are two copies of the database, with automatic failover, meaning no single point of failure.  The SAN environment in Figure 1 has only one copy of the DB in Prod, meaning that any DB failure will result in a cutover to DR.  CCR in Figure 2 provides fault tolerance in the event of server failure, storage failure and data corruption.
  • Enhanced Disaster Recovery - using SCR to replicate data to the DR site, there is now a fully supported and simple-to-manage DR process in place.  In Figure 1, data corruption in Prod could be replicated to DR, meaning that restore from backup would be required.  In Figure 2, SCR will not replicate corrupted data as it uses transaction log shipping to replicate.
  • Reduced TCO - there may be more copies (3) of the data in Figure 2, but the usage of DAS disks will significantly reduce the cost of storage which will likely more than offset the cost of the additional disks required.
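
The TCO bullet boils down to a simple break-even check, sketched below. It assumes Figure 1 keeps two copies of the data on SAN (one in Prod plus the replicated copy in DR) and Figure 2 keeps three copies on DAS. The per-GB prices and the data size are hypothetical placeholders, not vendor or MSIT figures.

```python
def storage_cost(copies, data_gb, cost_per_gb):
    """Raw storage cost for a given number of full copies of the data."""
    return copies * data_gb * cost_per_gb

def das_is_cheaper(san_cost_per_gb, das_cost_per_gb, san_copies=2, das_copies=3):
    """True when the DAS design costs less; the data size cancels out."""
    return das_copies * das_cost_per_gb < san_copies * san_cost_per_gb

# Hypothetical inputs, for illustration only.
DATA_GB = 10_000
SAN_PRICE_PER_GB = 20
DAS_PRICE_PER_GB = 5

san_total = storage_cost(2, DATA_GB, SAN_PRICE_PER_GB)
das_total = storage_cost(3, DATA_GB, DAS_PRICE_PER_GB)

print(san_total, das_total)                               # 400000 150000
print(das_is_cheaper(SAN_PRICE_PER_GB, DAS_PRICE_PER_GB)) # True
```

Under these assumptions the extra copy pays for itself whenever DAS costs less than two thirds of the SAN price per GB. Server, licence and operational costs are deliberately left out of this sketch.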

 

So how does it work in the real world?  A great example of the benefits of moving from SAN to DAS is our internal MSIT case study.

We saw benefits such as:

  • Messaging service levels exceeding high-availability targets of 99.99 percent.
  • Cost reductions in excess of $10 million per year.
  • Increased mailbox quotas by up to a factor of 10.
  • Consolidation of the initial Exchange Server 2007 base by nearly a factor of two.
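
For a sense of scale on the first bullet, the short calculation below converts an availability target into a yearly downtime budget. It is plain arithmetic on the 99.99 percent figure above and assumes a 365-day year.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year

def downtime_budget_minutes(availability_percent):
    """Minutes of downtime per year permitted by an availability target."""
    return MINUTES_PER_YEAR * (100 - availability_percent) / 100

budget = downtime_budget_minutes(99.99)
print(f"{budget:.2f} minutes per year")  # 52.56 minutes per year
```

In other words, exceeding 99.99 percent means staying under roughly 53 minutes of total downtime across the whole year.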

The MSIT whitepaper can be found at the TechNet site, in Word format, as a webcast, or as a podcast (WMA, MP3).

 

Comments

# Brian said on August 7, 2008 3:49 PM:

Hi Johann,

The "Cost reductions in excess of $10 million per year" were driven mostly by the reduction of TAPE, while your total terabytes and number of servers increased compared with the consolidated architecture common with Exchange 2003.

Is this a marketing blog, or a technical blog?

# jkruse said on August 7, 2008 6:05 PM:

Hi Brian,

My blog is for a technical audience, but you’ll find that a lot of what I talk about has a business focus.

Technical knowledge is all well and good but you need a business reason to drive technical projects, and that’s a large part of what my role at Microsoft is – assisting Enterprise customers to translate technical capabilities into business value, a good example being a reduction in storage costs with Exchange 2007.

You’re correct that in the linked article over $5-million out of the $10-million in cost savings is due to elimination of tape backups, but that means there is still $4-5-million in further cost-savings that are not due to tapes.

There are some additional points I think you’ve overlooked:

• CCR and DPM are the technical enablers of this tape elimination, while SAS disks (DAS) are the cost (business) enabler.  Without moving to DAS it would not have been feasible to implement CCR/DPM, and thereby eliminate tapes.

• You mention that our “total terabytes” increased.  Exactly – that is the whole point!  By using DAS rather than SANs we were able to *increase* our mailbox quotas by *10-times* at a *lower cost* than before.

• You mention that our “number of servers” increased.  Sorry, but I must have missed that bit – my reading of the article suggests that we reduced from 62 Mailbox servers (124 physical cluster nodes) to 34 Mailbox servers (68 physical cluster nodes)?

• That said – CCR means that we have 2 Mailbox servers (Active/passive) for every cluster, each of which has a copy of the data.  But again the point is that we increased our quotas by 10-times, and when you factor in CCR that really means we are storing 20-times as much data, but again this is at a lower cost than before.

• The only way we were able to achieve this was by moving to DAS – storing 20-times the data on SANs may be technically possible, but would have been completely cost prohibitive.
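
The figures in the points above can be checked with a few lines of arithmetic. The server and node counts, the 10-times quota increase and the two CCR copies come from this thread; the one assumption is that the previous shared-storage clusters held a single copy of each database.

```python
# Figures quoted in the comment above.
servers_before, servers_after = 62, 34
nodes_before, nodes_after = 124, 68
quota_factor = 10        # mailbox quotas increased 10-times
copies_per_cluster = 2   # CCR keeps an active and a passive copy

# Assumption: the old shared-storage clusters held one copy per database.
copies_before = 1

consolidation = servers_before / servers_after
data_factor = quota_factor * copies_per_cluster // copies_before

print(f"consolidation factor: {consolidation:.2f}")  # consolidation factor: 1.82
print(f"data stored: {data_factor}x")                # data stored: 20x
```

The 1.82 ratio matches the "nearly a factor of two" consolidation mentioned in the post, and both the old and new node counts are exactly two per Mailbox server.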

Cheers,

Johann

# bob said on September 10, 2008 2:54 PM:

Interesting post.  I understand your point about MSIT but it was not SAN technology or protocols per se, but the choice of SAN (vendor) that drove the expenses so high and thus the 200MB quota.  

There are many extremely fast, extremely robust, yet cost-efficient SAN implementations that deliver everything you tout for DAS and then some.  I have used DAS, SAN and even NAS for Exchange, and there is no question that SAN is the way to go at any reasonable # of mailboxes for a business.  After all, the purpose of SAN is to provide shared storage, not siloed.  

# jkruse said on September 10, 2008 6:06 PM:

Hi Bob,

That is a good point - the cost of storage is of course going to vary from vendor to vendor, and this is specific to the prices that we were able to get in MSIT.

Personally I have yet to see a SAN environment that is anywhere near as cheap or cost efficient as DAS for Exchange, but YMMV.

SANs are all about shared storage environments, but best practice for Exchange is to put it on dedicated storage (spindles etc) within the SAN, which is really creating a silo within the SAN.  Here's an excerpt from the linked case study:

"    Concerns that DAS would create storage silos and hidden operational costs   Another obstacle that prevented Microsoft IT from initially seeing CCR on DAS as a viable solution for Mailbox servers was the fact that DAS attaches directly to each cluster node, which creates individual storage silos. From a SAN point of view, it is an overwhelming proposition to create a large number of individual storage locations in the corporate messaging environment. In a SAN environment, ongoing costs for storage allocation, capacity management, performance management, and troubleshooting can quickly exceed the initial investment in hardware and installation. By assuming that this issue of hidden ongoing costs would also apply to DAS, Microsoft IT saw any initial DAS savings potential dwindle rapidly. Today, with the benefit of operating for more than 18 months of CCR on DAS in production, it is easy to say that DAS storage is "designed once and never touched again." However, in early 2006, Microsoft IT was unable to verify that there is truly no need for DAS capacity and performance management beyond the initial storage design. Replacing broken disks, cables, or redundant array of independent disks (RAID) controllers is merely a part of standard hardware maintenance. Downtime due to storage or other node failures is less than two minutes of failover time in a properly designed, CCR-based Mailbox server, and data loss is greatly reduced due to redundant copies of messaging databases on individual cluster nodes. In fact, when CCR on DAS is compared with shared-storage clusters on SAN, it is noticeable that there is less chance for data loss and less need for database restores from backup because CCR eliminates the data instance used by the active node as a critical single point of failure. CCR on DAS also does not create new storage silos. 
It merely moves the existing storage silos—which dedicated, exclusive Exchange Server storage represents in a shared SAN environment—out of the high-maintenance, high-cost environment into a low-maintenance, low-cost alternative.   "

And of course another major driver for us moving to DAS was an extended outage caused by SAN failure:

"   a SAN storage array failure occurred, taking down multiple Mailbox servers, and causing an outage and the loss of 8,000 production mailboxes. It took three days to bring the systems back online, and the worst news was yet to come. Through a combination of unfortunate circumstances, the most recent tape-based backups were also irretrievably lost. Microsoft IT was unable to restore the most recent data, and 8,000 users, including employees, partners, contractors, and vendors lost e-mail data. It was a horrible week for Microsoft IT and the Exchange Server product group alike. It showed not only the critical nature of shared storage as a single point of failure in the Mailbox server architecture, but also the vulnerability of an IT organization if it has to depend on tape-based backups as the primary means to recover from storage failures.   "

CCR gives us hardware redundancy (e.g. we are still using RAID), data redundancy (two verified copies of the database, automatic failover), and with SCR a 3rd copy of the data with full site/datacentre redundancy... all out-of-the-box at no extra cost aside from the extra servers/disks.

# anupama said on December 2, 2008 12:22 AM:

Hi,

What you have given differs from the heading.

It's quite the opposite.

# Johann's Unified Communications said on December 14, 2008 4:46 AM:

I have many interesting conversations about my post on Advantages of DAS over SAN storage in Exchange

# David Vellante said on December 28, 2009 7:07 AM:

Interesting post. Thanks for the information.  

One thing I haven't seen discussed much that we've modeled in the wikibon community is the degree to which a SAN infrastructure can be leveraged across multiple applications.

Our TCO models show this issue of SAN vs DAS cost is not as black and white as many suggest. It appears that TCO is largely a function of how many applications a SAN can/will support over its useful life.

If you look at Exchange as a silo (which according to comments above is best practice from a storage allocation perspective) the TCO models we've built show that indeed DAS will be cheaper from a CAPEX and OPEX perspective...especially in smaller configurations.

However, as you begin to scale and support more applications with a switched SAN infrastructure (even allowing for dedicated storage allocation for Exchange from the SAN as is suggested above-- i.e. a 'silo' within the SAN) SAN costs will decline dramatically-- and from a return on assets perspective often be a superior cost choice relative to DAS (from both a capex and opex standpoint).

It's important for CIOs to understand this dynamic. If you look at Exchange in isolation, costs will often be lower with DAS. But if you think about the asset leverage from an installed base over 5 years you will often find that SAN is less expensive from both a capex and opex standpoint-- similar to a factory utilization analogy in manufacturing.

The critical factor is how much leverage can I get across my application portfolio with a "more expensive" switched SAN infrastructure?

So my suggestion (with all due respect) is don't just blindly follow the advice of Microsoft or that of a SAN vendor. Do some homework and understand the dynamics of cost for YOUR SPECIFIC situation.
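
One hedged way to start that homework is a toy amortization model like the one below. It is not the wikibon TCO model: it simply assumes a SAN has a fixed infrastructure cost shared across all applications plus a per-application storage cost, while DAS has only a per-application cost. Every figure is a hypothetical placeholder.

```python
import math

def san_cost_per_app(fixed_cost, per_app_cost, apps):
    """SAN cost per application when the fixed cost is shared evenly."""
    return fixed_cost / apps + per_app_cost

def break_even_apps(fixed_cost, san_per_app, das_per_app):
    """Smallest number of applications at which SAN is no dearer than DAS.

    Returns None when DAS is never more expensive per application,
    in which case sharing the SAN cannot close the gap.
    """
    if das_per_app <= san_per_app:
        return None
    return math.ceil(fixed_cost / (das_per_app - san_per_app))

# Hypothetical inputs, for illustration only.
SAN_FIXED = 500_000      # switches, controllers, management tooling
SAN_PER_APP = 50_000     # dedicated spindles carved out per application
DAS_PER_APP = 150_000    # disks and enclosures bought per application

print(san_cost_per_app(SAN_FIXED, SAN_PER_APP, 1))           # 550000.0
print(break_even_apps(SAN_FIXED, SAN_PER_APP, DAS_PER_APP))  # 5
```

With these inputs a single application is far cheaper on DAS, and the SAN only catches up once five applications share it. If DAS is also cheaper per application, as the post argues it is for Exchange, the function returns None and no amount of sharing changes the answer.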
