This blog business has been turning out to be more fun than I expected - maybe I should be nicer and less sarcastic when the marketing team suggests these ideas.
This time, answering the questions from my archiving blog and video has led into some areas which are a bit more controversial. There are two key issues that we talk about in my latest video:
1) Tiered storage: is it a great way of reducing costs?
2) Stubbing approaches to archives: the world’s most elegant architecture or what?
Tiered storage is a conversation I have been having in a variety of contexts with numerous storage experts for the past decade. In short, the idea that you can lower your costs by separating your data into different tiers and putting that data on different types of hardware is only true under very restricted conditions. Given the mass storage hardware available for the past decade, however, those conditions simply do not occur for content that needs to be, in any meaningful way, 'online'. So, unless one of your tiers is a bunch of data stuffed onto tapes that are squirreled away in an offsite vault, of which you expect to access only a tiny fraction, it is pretty clear that the lowest cost solution is to put all of your data, hot and cold together, onto the biggest, cheapest drives you can find. End of story. I have been nibbling away at this theme a bit in previous posts, but this is an attempt to make it as black and white as possible: tiered solutions for online data will increase your costs. They will make you less green, they will cost you more $$, they will make you less efficient and they will reduce the productivity of your users.
The other issue raised in response to the archiving video was stubbing: moving the bulk of the message out of Exchange and leaving only a link or ‘stub’ behind. I consider the stubbing approach to be one of those kludges that occur regularly in the software industry and are done out of necessity. In Exchange 2003 and earlier, stubs were about the only thing many customers could do to avoid the tyranny of tape backup solutions for an online Exchange store. Even so, only a tiny fraction of customers ever deployed solutions based on this approach. Those that did were not very happy but realistically didn't have any alternatives. Now that there are real alternatives, the complexity, fragility, incompleteness and expense of stub solutions should make anyone thinking about deploying them pause and think for a very long time.
Interesting points, but they don't address backup, which represents the majority of the lifetime cost associated with storage. If I can eliminate 90% of the data from my backups, then that represents a real cost saving in media and backup window. It also drastically improves achievable restore SLAs.
Stubbing - yes, it's a kludge.
But two tiers of data? There's another important reason for that. It's not cost. It's recovery time.
In a recovery scenario, the email I want back is the recent stuff. I might have a 5GB mailbox. But in a recovery scenario, I'm really most concerned about getting back up and running with the most recent 100MB. If I can divide my mail into "recent useful stuff" and "old stuff I may need very occasionally", I can get back the "recent useful stuff" faster.
Multiply that by 200 users in a storage group. Wrangling a terabyte of data can take a long time, especially if you have to do anything like checking consistency or even repairing the database. Or even if one step in the process breaks down and you have to start again. Why fight with a terabyte when my immediate needs are just 20GB?
Hence everyone waiting for SP1.
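The sizes quoted in this comment can be checked with a few lines of arithmetic. This is only an illustrative sketch in Python: it assumes decimal units (1 GB = 1000 MB) and reads the commenter's "Gb"/"Mb" as gigabytes and megabytes; the function name `group_size_gb` is made up for the example.

```python
MB_PER_GB = 1000  # assumption: decimal units


def group_size_gb(users, per_user_mb):
    """Total size of a storage group in GB, given the per-user size in MB."""
    return users * per_user_mb / MB_PER_GB


# 200 users with a 5GB mailbox each: the "terabyte" in the comment.
full_gb = group_size_gb(200, 5 * MB_PER_GB)

# 200 users with only their most recent 100MB each: the "20GB" in the comment.
recent_gb = group_size_gb(200, 100)

if __name__ == "__main__":
    print(f"full: {full_gb} GB, recent: {recent_gb} GB, ratio: {full_gb / recent_gb}")
```

Under these assumptions the "recent useful stuff" is one fiftieth of the full storage group, which is the gap the commenter is pointing at.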
The other thing missed by MS in this case is that I may want more copies (greater redundancy/HA) of my primary storage and fewer of my archive. Their cost saving argument assumes I have the same number of copies of my archive databases/mailboxes.
I totally agree with Perry's comments about stubbing. We started going down that route with a 3rd party product and quickly realized what a mess it creates, so we are going to deploy Exchange 2010 and use tiered archive storage once SP1 comes out.
I do disagree with his premise that the expensive tier 1 storage would be wasted. Although in larger environments SAN-based RAID groups might be dedicated to particular applications, we often have different apps sharing RAID groups as long as it does not create a performance issue. We have very little waste across our SAN.
I also agree with others' comments about the SLAs for recovery. Although we will use DAGs with one passive copy, recovery time is still a big issue, not only for disaster recovery, where we want the Tier 1 data back fast, but also for things like restores to RSGs. If we have a true disaster which takes out our one and only data center, then we will be recovering only the T1 data first and going after the archive stores once all other critical systems are restored.
@koolhand The key point to recognize in replacing a backup strategy with a replication strategy is that recovery time stops being proportional to size of storage and becomes constant. It takes the same 30 seconds or so to activate your replica whether there is 2GB of data or 2TB of data in the database. Partitioning your data simply does not recover your data faster. Check out my blog post on Exchange Data Protection - the video covers this point specifically starting at 2:50.
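The contrast in this reply (restore time proportional to size, activation time constant) can be sketched as a toy model. This is a hedged illustration in Python, not a measurement: the 30 second activation figure comes from the comment, while the restore throughput of 100 MB/s and both function names are assumptions made up for the example.

```python
ACTIVATION_SECONDS = 30.0  # "the same 30 seconds or so" quoted in the comment


def restore_seconds(size_gb, throughput_mb_per_s):
    """Backup restore: modeled as growing linearly with the amount of data."""
    return size_gb * 1000.0 / throughput_mb_per_s


def activation_seconds(size_gb):
    """Replica activation: modeled as constant, independent of database size."""
    return ACTIVATION_SECONDS


if __name__ == "__main__":
    hypothetical_throughput = 100.0  # MB/s, assumed for illustration only
    for size_gb in (2, 2000):  # the 2GB and 2TB cases from the comment
        print(
            f"{size_gb} GB: restore ~{restore_seconds(size_gb, hypothetical_throughput):.0f} s, "
            f"activate ~{activation_seconds(size_gb):.0f} s"
        )
```

Whatever throughput you plug in, the restore estimate for 2TB is a thousand times the one for 2GB, while the activation estimate does not move. That is why partitioning the data helps in the first model and does nothing in the second.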
@Paul Yes, backup strategies are expensive. Don't do it. That is my whole point. Microsoft has not run backups on any of our internal email for years now, because we have bet on our replication strategy and found that it provides more protection for a lot less money. The overall implementation covers the cases of: data protection in the face of multiple server and infrastructure failures; users and administrators accidentally deleting data; records retention for legal requirements (I doubt it comes as much of a surprise that our legal department has its fair share of discovery and legal hold activities); and physical and logical corruption. I think my previous blog posts, “Exchange Data Protection” and “More Exchange Data Protection – Beyond Replication”, cover most of those cases in more detail.
If stubbing were part of Exchange's built-in archiving, then client add-ons would be unnecessary, as either the CAS or the Mailbox role could still present the entire message to the client; it'd be transparent to the user.
The index and the "data" are already separate... only in one monolithic container (the Jet database). Since most of the IO is due to the indexes and headers, if you were able to put them on separate storage, since they’re so small, you could put them on SSD (500GB of SSD in a SAN isn’t that expensive) and put the message bodies on FATA/SATA, and you’d have great performance and really cheap storage.
Sure, other vendors have implemented stubbing poorly (*cough* *cough* Symantec *cough*), but that hardly means that it cannot be done right (especially if it can be integrated into Exchange).
If you don't have access to another site or data centre to replicate to, then you are still stuck with shipping tapes offsite periodically. I think it's those folks who need to worry about recovery time in the event of a disaster. (Of course they are probably also the ones who could be considering moving it all to the cloud...) They'd want to be able to restore tier 1 data before the old archive stuff. But that doesn't mean that the data needs to be on different types of disks. The other thing to remember is that Exchange 2010 doesn't need very fast disks to begin with... slap some big RAM in your servers and chances are that the disks you were considering for your "tier 2 cheapo" storage are good enough for the whole thing. You can play with the storage calculator to see different combos.
What I do miss is tiering beyond the current common SAN and DAS technology: there is no mention of SSD (see Jason's comment). Plus, getting really small mailboxes for your live data makes it possible to have less powerful mailbox servers. What I also miss is any numbers on how the usage pattern of a dedicated set of archive servers and dedicated live mailbox servers can affect the number of mailboxes (for live data, as they get smaller) or archives (as the performance hit is lower) per database. Then tiering does make sense, because you might be able to sell different RTOs (see Koolhand's comment) and buy hardware more specific to the requirements.
And there's nothing wrong with aggressive mandatory archiving rules if they beat having costly quotas and .pst sprawl on file servers (and backup), or data loss.
@Jason, I don't think you are getting the core point about tiering and how it increases the cost.
I did cover the case of SSDs in my first video - blogs.technet.com/.../getting-the-conversation-started.aspx. But going through this might work better on paper.
SSDs are faster from a random IO perspective (and in particular on reads). However, they are very expensive on a per byte basis. For drives that you can install on a server, they are on the order of 50 times the cost per byte of a near-line SATA drive. Worse than this, the projected cost per byte is not converging over time.
OK, so let's take your idea and put all the indexes and headers in our database on a set of SSDs and all the other data on the big slow drives. A conservative estimate is that indexes and headers account for over 10% of the total space used in our databases.
So let's take a specific case of 500 users with 2GB of space actually used for each user. Total logical space is 1TB. You could buy a single 2TB drive for each set of 500 users and another drive for each logical copy that you think you need to protect against hardware, network, datacenter and human error. Each of those near-line SATA drives is going to cost you a couple hundred dollars.
Your alternative is to replace the SATA drive with a drive that holds 10% less data (let's say we can get away with a 1.5TB drive instead of 2TB, with a very generously estimated savings of $50) and ADD another SSD drive for the index that is at least 100GB in size. Where do you think you can buy server-ready SSDs for less than $50 for 100GB? You can't even buy MLC flash for that kind of price. Net, you have increased your costs with tiering, not reduced them.
This scales up of course. But the estimates for the case of 10,000 users each with 5GB of storage would eliminate a lot of the rounding up that helped make the costs seem closer than they are.
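The comparison above can be written out as a small cost model. This is a rough sketch in Python using only the figures quoted in this reply (about $200 for a 2TB near-line SATA drive, SSD at roughly 50 times the cost per byte, indexes and headers at about 10% of the data). It prices storage per GB instead of per whole drive, so it removes the rounding mentioned above; the number of copies is ignored since it multiplies both options equally. All names in the code are made up for the example.

```python
SATA_DRIVE_PRICE_USD = 200.0  # "a couple hundred dollars" per near-line SATA drive
SATA_DRIVE_SIZE_GB = 2000.0   # 2TB drive
SSD_COST_MULTIPLIER = 50.0    # SSD: "on the order of 50 times the cost per byte"
INDEX_FRACTION = 0.10         # assumed share of space used by indexes and headers


def sata_cost_per_gb():
    """Cost per GB of near-line SATA implied by the figures above."""
    return SATA_DRIVE_PRICE_USD / SATA_DRIVE_SIZE_GB


def single_tier_cost(users, gb_per_user):
    """Everything, hot and cold together, on near-line SATA."""
    return users * gb_per_user * sata_cost_per_gb()


def tiered_cost(users, gb_per_user):
    """Indexes and headers on SSD, everything else on near-line SATA."""
    total_gb = users * gb_per_user
    index_gb = total_gb * INDEX_FRACTION
    body_gb = total_gb - index_gb
    ssd_cost_per_gb = sata_cost_per_gb() * SSD_COST_MULTIPLIER
    return body_gb * sata_cost_per_gb() + index_gb * ssd_cost_per_gb


if __name__ == "__main__":
    for users, gb_per_user in ((500, 2), (10_000, 5)):
        single = single_tier_cost(users, gb_per_user)
        tiered = tiered_cost(users, gb_per_user)
        print(f"{users} users x {gb_per_user}GB: single tier ${single:,.0f}, tiered ${tiered:,.0f}")
```

With these inputs the tiered layout costs about 5.9 times as much per copy at any scale, because 90% of the data is priced at the SATA rate and 10% at fifty times that rate (0.9 + 0.1 × 50 = 5.9). Different price assumptions will move the ratio, but the SSD premium would have to fall a long way before the two options meet.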
Not all 3rd party archiving solutions require software to be installed for the stubbing functionality to work. Also, most 3rd party archiving solutions offer stubbing as an option; it is not mandatory.
For some of us, cost is not the issue. Our end users' expectations are such that archiving without stubbing is a non-starter, particularly when indexing has not occurred. MS appears to be trying to eat their gold partners by introducing Archiving 1.0 in Exchange 2010, but those "in the know" realize that, true to MS form, it won't be ready for production until Service Pack 3.
That said, for a green field implementation, it is probably OK to start here. For those that have stubbing solutions already deployed, it is a non-issue.