One of the big shifts that continues to reshape the software industry is the rapid drop in the cost of storage. Not only are absolute storage costs falling quickly, but storage is also becoming cheaper relative to the other physical resources that drive hardware cost. Yes, CPU and memory are becoming much less expensive too, but the cost of storage is dropping at a rate that dwarfs all of the other inputs.
In principle that means there should be opportunities to redesign systems to take advantage of the changing ratios. In this video we look at a pretty cool example where the design of Exchange Server 2013 does exactly that. Specifically, we used some extra storage to eliminate a bunch of redundant calculations, lowering the TCO of the system by changing the way we perform indexing operations in Exchange.
A couple of side points:
1) In previous releases, the TCO of the system was lower because the space savings were worth the extra calculation time. It is only now that these 'wasted' calculations are a net loss.
2) The TCO win is enabled not just by lower disk storage costs but also by a replication based backup/data protection paradigm. Without the investment in Exchange HA this approach would probably still not be a net TCO win.
The changes to search are not just about TCO. With the change to the underlying search engine, search is faster, there are new user experiences, and there are new features that enable eDiscovery across Exchange, SharePoint, and Lync. Related to the eDiscovery improvements, there are also new in-place hold improvements which let you specify which items to hold using query parameters such as keywords, senders and recipients, and start and end dates, as well as specifying a duration for items on hold.
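To make the query-based hold idea concrete, here is a minimal sketch in Python of what "hold by query parameters" amounts to: a predicate evaluated per item. The field names and policy shape here are my own illustration, not the actual Exchange schema or cmdlet surface.

```python
from datetime import date

# Illustrative sketch of a query-based in-place hold predicate.
# Field names (sender, sent, body) are hypothetical, not the Exchange schema.

def make_hold_predicate(keywords, senders, start, end):
    """Return a function deciding whether an item falls under the hold."""
    def matches(item):
        in_window = start <= item["sent"] <= end
        from_sender = not senders or item["sender"] in senders
        has_keyword = not keywords or any(
            k.lower() in item["body"].lower() for k in keywords)
        return in_window and from_sender and has_keyword
    return matches

hold = make_hold_predicate(
    keywords=["contoso merger"],
    senders=["legal@contoso.com"],
    start=date(2012, 1, 1),
    end=date(2012, 12, 31),
)

item = {"sender": "legal@contoso.com",
        "sent": date(2012, 6, 1),
        "body": "Re: Contoso merger due diligence"}
print(hold(item))  # True: in the date window, matching sender and keyword
```

The point is that hold scope becomes a query, so items matching the criteria are preserved while everything else stays under normal retention.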
Keep watching and let me know if you have questions, or other topics you want me to talk about.
Well, the new version of Exchange is almost ready to be released and the first MEC in a decade starts in a few days! So we can finally start talking about what is in the next version of Exchange for our on-premises customers and for the next release of the Office 365 service.
To kick things off this new video is a fairly high level overview of some of the core design changes in the product. This video is a little longer than usual but it’s got a huge amount of good information. Here are some of the things we are particularly excited about:
MEC is only a few days away, and I’ll be explaining more about why we build Exchange during my day 2 keynote. Do you have an Exchange question you want me to answer? Email email@example.com or tweet your question using the #askperry and #MECisback tags!
And some bonus information… One of the interesting things about this release is that it represents the culmination of a three-release path we started down when we planned Exchange 2007, nearly a decade ago. If you are interested, here is a video that was filmed a while ago but was put on ice since we couldn’t talk about the new version of Exchange until we released the preview…
See you at MEC!
In this installment we talk a little bit about how we have built on the underlying RBAC infrastructure and PowerShell to enable zero elevated access in the service. This is a very important part of the overall access control system: anyone who needs to administer the service is protected from accidentally modifying it, because they must go through a process to obtain privileges before they start. It also ensures that only people with the right skills and background checks can ever access powerful administrative functions, and even then only after someone else gives the final go-ahead. That we are able to run with no elevated privileges all the time, without impacting the productivity of the people who manage the service, has given us a lot of confidence in the RBAC/PowerShell model.
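The "no standing privileges" flow described above can be sketched as a tiny approval-gated, time-limited grant. Everything below is a hypothetical illustration of the pattern, not the actual service tooling; the class and role names are mine.

```python
import time

# Hypothetical sketch of approval-gated, time-limited privilege elevation.
# Nobody holds admin rights by default; a scoped RBAC role is granted only
# after a second person approves, and the grant expires on its own.

class ElevationRequest:
    def __init__(self, engineer, role, duration_s):
        self.engineer = engineer
        self.role = role              # a scoped role, e.g. "Mailbox Repair"
        self.duration_s = duration_s  # grants are time-limited by design
        self.approved_by = None
        self.granted_at = None

    def approve(self, approver):
        if approver == self.engineer:
            raise PermissionError("requester cannot approve their own elevation")
        self.approved_by = approver
        self.granted_at = time.time()

    def is_active(self):
        return (self.approved_by is not None
                and time.time() - self.granted_at < self.duration_s)

req = ElevationRequest("alice", "Mailbox Repair", duration_s=4 * 3600)
print(req.is_active())   # False: no approval yet, so no privileges at all
req.approve("bob")       # a second person gives the final go-ahead
print(req.is_active())   # True, and only until the grant expires
```

The useful property is that the default state is "no access", so an administrator cannot accidentally modify the service without having deliberately gone through the request-and-approve step first.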
If by chance you haven’t seen my conversation on RBAC, it might be good to check it out here first. This link to the Office 365 Trust Center is also great background.
Welcome back to lovely sunny Seattle/Redmond, figuratively speaking of course. This video is a discussion about our approach to moving mailboxes—the Mailbox Replication Service (MRS).
Starting with Exchange 2007 SP2, administrators can create long-running move jobs for load balancing, migrating between versions, replacing and upgrading hardware, and making the jump to the cloud. The architecture and approach will remain consistent for all current versions of on-premises Exchange and Exchange Online, and for all the upcoming versions we have even glimmers of planning for.
Moving mailboxes is a key part of managing an email service for the long haul. There will be updated technologies, new business priorities, mergers, acquisitions, and hardware and version upgrades. So it is absolutely key to have a robust, fire-and-forget, throttled system that lets administrators accomplish this work while keeping the moves transparent to actual users (by making sure they stay online throughout the process), and to make sure the process scales up gracefully as mailboxes grow from gigabytes to tens of GB to hundreds.
I was amused recently to find out that one of our cloud competitors is trumpeting the benefits of their simple cloud story, by pointing out how happy users were with their ‘Fresh and Clean’ mailboxes after the ‘migration’. By fresh and clean they meant empty. This awesome and simple ‘migration’ just creates new mailboxes and orphans the data in the originals. If providing a service that makes sure email doesn’t get lost and is available to users every day makes sense, then our perspective is that it is also important to keep users’ mail intact as they move to the cloud.
In this video we spend some time talking about an aspect of our archiving and compliance story for Exchange on-premises and the service. Specifically, if you do decide to take advantage of the simplicity and lower costs of co-locating your ‘archive’ data with your primary mailbox data, (see here for a previous discussion about why we think this approach is better, cheaper and simpler) do you have to give up on immutability?
The short answer is an emphatic no; not only do you get a great immutability story, but you get one that provides more finely-tuned control over the content you decide to make immutable. The slightly longer answer about how this works, and how Exchange provides built-in immutability policies that allow companies to comply with important regulations and internal policies, is in the video. If you want more detailed whitepapers on the subject, please see here.
You may have also seen that we recently released PST Capture – this is another step in putting your rogue .pst files onto Exchange where you can get control of them and start treating that data with the same immutability story.
It was suggested to me that I provide some more information on a few specific things I talk about in the video. By Documents, I of course meant any email, attachment, or other item that you happen to have stored in your Exchange mailbox. Also, when I talk about ‘Dumpster 2.0’, that really is only the internal name for our retention hold functionality; ‘officially’ it is called the Recoverable Items Folder.
I also made a reference to copy-on-write. For those that are interested, copy-on-write is an approach to modifying content without destroying the original: a copy of the item is made before writing. In Exchange, when a user wants to modify a message or other piece of content that is marked for retention hold (immutable), the server first makes a copy of the message, moves the original to the Dum…the Recoverable Items Folder, and then makes the modifications to the copy.
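Here is a minimal sketch of that copy-on-write behavior in Python. The data model is deliberately simplified (mailboxes as dictionaries of folder lists) and is my illustration, not how the Exchange store actually represents items; only the three-step sequence mirrors the description above.

```python
# Illustrative copy-on-write for an item under retention hold.
# Folder names follow the text; everything else is a simplification.

def modify_held_item(mailbox, folder, index, new_body):
    """Modify a held item without destroying the original content."""
    original = mailbox[folder][index]
    edited = dict(original, body=new_body)         # 1. copy the item first
    mailbox["Recoverable Items"].append(original)  # 2. preserve the original
    mailbox[folder][index] = edited                # 3. modify only the copy

mailbox = {"Inbox": [{"subject": "Q3 numbers", "body": "draft"}],
           "Recoverable Items": []}
modify_held_item(mailbox, "Inbox", 0, "final")

print(mailbox["Inbox"][0]["body"])               # "final"
print(mailbox["Recoverable Items"][0]["body"])   # "draft" - still intact
```

The user sees their edit take effect, while discovery and hold processing can still find the unmodified original in the Recoverable Items Folder.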
I'm always interested in any comments you have on Immutability, Archiving or other Exchange topics you want me to talk about.
This video provides some answers to a question we’ve heard a lot: how can you provide large mailboxes in the service at such a low cost? In the video, I talk about some of the things that let us run the service at such low cost, which in turn lets us offer a low-cost service to our customers. We focused on things that are really intrinsic to running a broad service.
The video covers a few different areas: time averaging, efficient data centers, user profiles, and how the things we’ve learnt from the service can help customers running Exchange on-premises. One additional thing we didn’t get a chance to talk about, which also helps reduce costs dramatically, is the benefit we get from using PowerShell to automate almost every aspect of our deployment, monitoring, and management of the service. Since people costs can be a large part of deployment and management costs, this automation lets us streamline our operations and is very important to the economics of the service.
As well as the cost aspects, we talk about some of the Green aspects of running a datacenter, and if you want more information about this, take a look here: http://blogs.technet.com/b/msdatacenters/
Oh, and just to clarify the part of the video when I talked about paying attention to the profile of the users in the service - I was only talking about the statistical aspects of the aggregate load. We don't look at what a particular person or organization is doing, nor do we read their mail! That would be invading people's privacy which we don't do. The Office 365 Trust Center has more details about how we think about this: http://www.microsoft.com/en-us/office365/trust-center.aspx
Want to know more about how we built Exchange? Use the comments section to send me your thoughts, ideas and questions.
In my last Geek Out video we talked about the interesting technical differences in building a service versus building software for companies to run on-premises. This time we spend some time talking about the cultural shift the team went through as services became one of the main products of the Exchange team and the key changes we made to ensure that the right cultural shifts happened.
One of the things that was very important for this shift is that component teams (usually about 6 devs, plus testers and program managers) are responsible for monitoring their component as part of their product work, for deciding which alerts from their monitors should be pageable, and for handling those paging alerts unmediated by any other humans on an operations team. This model keeps the team closely engaged with the service, keeps us learning, and helps us keep improving the service.
While the model provides great benefits for our service customers in terms of availability, and ongoing improvements to the service, it also has some positive effects on the team. Interestingly, the component teams across Exchange that have had the most active engagement with the service (and consequently the most on-call load) have the best morale. My personal theory is that the sense of satisfaction from seeing such direct impact from your work is the key driver for the improved morale of the team. The change in culture and how we think about customers is captured in a recent video with the team:
Want to know more about how things work in Exchange? – post a comment, ask a question!
During TechEd North America in Atlanta earlier this year, Ann Vu invited attendees to record the questions they wanted me to answer. When she got back, she interrupted a meeting I was having in my office with Matt Gossage to get some answers to questions about long-term email preservation, database recovery, clustering concepts, site resiliency, and even got me talking about why I love Exchange.
It turns out it was highly fortuitous that Matt was in the office at the time, since many of the questions were about Exchange 2010 High Availability (HA). So this turned into a great opportunity to get answers directly from one of the key individuals behind our HA investments. Some people might think this was all too convenient, perhaps even staged. All I can say is that the world would be simpler with a little less cynicism.
Thanks to Martin, Eamon, Mitch, Doug and Matt who took the time out to record questions. Keep posting those comments and asking me the questions you want answers to.
It has been a while since my last posting. We have been pretty busy dotting the i's on the new Office 365 service, and deploying the beta (which has gone very smoothly).
The conversation this time is centered on the framework through which we have tackled the challenges of making sure that we can deliver a highly reliable service. In this video, I cover some of the areas the Exchange team has focused on as our responsibilities have evolved from selling on-premises software to also encompass managing a service.
The most interesting aspect is the extra focus applied to making sure that the isolation models in the service are well thought through. On-premises deployments get isolation for free, since each company's deployment is heavily isolated from every other one. To get efficiencies of scale, cloud services risk tightly coupling the whole deployment, so that one mistake or design flaw can cause failure for millions of customers.
In thinking through isolation we look at the important usage scenarios or resources within the service and try to identify a coherent set of nested isolation levels. To pick a case at random, say storage: the Exchange service relies on a number of strictly nested layers of isolation, for example messages, mailboxes, databases, servers, database availability groups, and forests. In addition, there are datacenter and regional isolation layers that help but aren't in a strict hierarchy with the core layers.
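One way to see why strict nesting matters is that it bounds the blast radius of any fault: a failure contained at one layer can only affect things nested inside that layer. A tiny sketch, with the hierarchy taken from the text and the containment logic being my own simplification:

```python
# Strictly nested isolation layers, innermost to outermost, per the text.
LAYERS = ["message", "mailbox", "database", "server",
          "database availability group", "forest"]

def blast_radius(failing_layer):
    """A fault contained at one layer affects only that layer and the
    layers nested inside it; everything outside is untouched."""
    i = LAYERS.index(failing_layer)
    return LAYERS[: i + 1]

print(blast_radius("database"))
# ['message', 'mailbox', 'database'] - the server, DAG and forest survive
```

The datacenter and regional layers mentioned above don't fit this strict chain (a database availability group can span datacenters, for instance), which is exactly why the text calls them out separately.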
This talk is centered on the framework or philosophy by which we think through our designs for reliability and availability. In future blog postings I think it would be interesting to work through some examples or case studies of these scenarios. Please let me know if there are any particular areas that would be of interest.
It is pretty exciting to see HP release a new hardware solution that provides a simple, scalable way to build the smallest and largest Exchange deployments. This new solution maps closely to the architectural building-block concept that I’ve written about in the past - Blog Post: Exchange Mailbox Storage Bricks
I think what is most exciting about this is the general direction that HP is taking towards storage. They are building a platform that allows you to build very large Exchange deployments that at the core are essentially the same as what you would build for a 500 or 1000 person company and just multiplied x times. The platform has the advantages of low cost, simplicity of scale-out, and consistency of hardware (once you have validated your first block in the deployment you know that when you add your 20th block, Exchange will run just as well on the first block and the last). Also, since HP has already done the sizing and validation for a given user profile and you have access to all the test procedures and results there is an additional scale benefit: much of that work gets done once by HP not once per deployment.
That last scale advantage also applies to all the pre-configuration that HP has done for the solution. This work makes sure that the best practices that the Exchange team recommends are implemented correctly (for example: storage parameters configured when creating LUNs and static ports for MAPI). So the solution is ready 'in-the-box' and it will even run ExBPA for you after deployment so you can be sure that everything is complete and ready for users.
This work is part of a long term partnership with HP. While I am very excited to see this solution and the degree it embraces our storage story, I am even more excited about the work this partnership is going to produce down the road.
Here are some links describing the new HP solution in detail:
Following the ‘why we built Exchange the way we did’ theme, I wanted to take some time to explain some architectural changes that have been made to Exchange over successive releases.
After Exchange 2003 shipped we took a step back and assessed the state of the code base. At that point, Exchange had grown fairly organically over the various releases since the start of coding about a decade earlier. It was clear that there were assumptions built into the core architecture that were no longer optimal, given the steady, exponential changes accumulating in the hardware we ran on and in the way customers were using the product. One option was to start with a blank sheet and attempt to rewrite the product in one release.
We decided this was not a likely path to success given the large scope that Exchange had evolved to encompass. However, it was also clear that dramatic changes were needed and that we had exhausted our ability to approach Exchange development in a feature-by-feature and release-by-release manner.
The 3-release refresh cycle that we embarked on is a subject of a future video posting. So rather than talk about the broad approach, I thought it would be better to start with an example of an important scenario that we delivered as part of Exchange 2010 that had its roots in some key Exchange 2007 investments.
Specifically, this video focuses on the RBAC story in Exchange 2010. In it I explain how the role based management delegation story in Exchange 2010 is rooted in, and was a key driver for, the work to switch to a verb oriented, task/cmdlet-based system management infrastructure back when we were planning the 2007 release.
This is a great example of a multi-release strategy successfully solving a problem that had eluded previous attempts to build useful management roles in the scope of a single release.
While Ann was at TechEd Europe in Berlin last month, she asked attendees about the questions they wanted me to answer, and rewarded them with some great T-shirts!
When she got back, she visited me in my office to get some answers to questions about Offline Archives, 64-bit MAPI and more…
Thanks for the questions – I’m always interested in hearing from you, so please continue to post those comments and questions and I’ll be back blogging in 2011. Happy New Year!
There were some interesting comments from the last video entry, and they distilled into a few interesting themes that I wanted to respond to:
1) Perry, you seem to be living on a different planet than I do—one where spindles don’t last much longer than 4 years and where the capacity is doubling every 18 months or so.
2) The key is virtualization
3) It is bad that software companies ‘decide’ when you should do your hardware upgrades
4) Users of Notes/Domino seem to be pretty interested in the Exchange upgrade experience
Before I go through those themes, I thought it would be good to put some data around what we saw in previous releases of Exchange. One key data point we looked at carefully during Exchange 2007 planning: even though the upgrade from Exchange 2000 to Exchange 2003 was a very straightforward process, as painless as the Exchange 2000 SP2 to SP3 upgrade, we saw 80% of customers choose a migration route, and a large chunk of the upgraders had considerable remorse when the inevitable migration of the hardware platform hit them a few months later.
In some ways the product choices made back then constitute a good case study now that the results can be measured. In Exchange we had some tough decisions about how hard to bet on 64-bit architectures and whether to make some other pretty significant architectural improvements. We went down this path, optimizing heavily for 64 bits, dropping 32-bit support in Exchange 2007, and making a set of other big architectural bets. Fast forward 7 years and Exchange has drawn dramatically ahead in enterprise share, with analysts like Gartner giving Microsoft the only “strong positive” rating in their 2010 MarketScope for E-Mail Systems report.
On to the themes…
1) I framed this one a little unfairly. The lifetimes and the Moore's Law constant for disk drive capacity growth are facts, not opinions (disk drive capacity has reliably doubled every 18 months or so for a long time), so it really isn't these facts that are in question but their implications that people are reacting to. For example:
“Well, I have the feeling we don't live on the same planet. In my world, the servers hardwares keep running for more than 3-4 years. The storage is not even part of the server hardware but rather attached to it and can be upgraded without impact on the server itself (SAN). The CPU is hardly ever a limit, memory is. We, real admins, all know that memory can be upgraded fairly easily and we take hardwares that can evolve. On the other end, that's true that roll-out migration gives up more work. We don't have enough, I guess.”
Ok, so you may be using a SAN, but the fact remains that even SANs benefit from the underlying trends. If you wait longer than 4.5 years to upgrade (an 8x improvement in $/byte), you can double the server storage and cut the storage cost by enough to get a less-than-12-month payback just on the maintenance cost of the SAN. You can get paybacks measured in a couple of months if you replace the expensive SANs with DAS storage and take advantage of the big HA investments in Exchange. Going through the upgrade overhead without taking advantage of the opportunity to drastically reduce your costs is not rational.
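The 4.5-year/8x figure falls straight out of the 18-month doubling period: 4.5 years is three doubling periods, and 2 to the power of 3 is 8. A quick calculation makes the relationship explicit:

```python
# Capacity (or $/byte improvement) implied by an 18-month doubling period.
def capacity_multiple(years, doubling_months=18):
    doublings = years * 12 / doubling_months
    return 2 ** doublings

print(capacity_multiple(4.5))  # 8.0 - the same budget buys 8x the bytes
print(capacity_multiple(3.0))  # 4.0 - even a 3-year-old server is 4x behind
```

Run it with your own hardware lifetimes to see how quickly the economics of keeping old spindles in service fall apart.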
2) Virtualization is basically orthogonal to the storage issue. If the on-disk structures change, virtualizing the CPU nodes won’t eliminate the need for a migration; it will make deploying the new image on your servers pretty easy. BTW, if you are thinking about using virtualization and the implication of that is to consider SANs, don’t. You can read more about my views on virtualization here - Storage performance and my take on virtual storage
3) I would agree that it would be bad for Exchange to decide when customers upgrade their hardware. We don’t; admins get to decide. We have a strong yearly cadence of upgrades that includes significant new functionality through the Service Pack pipeline, and these are strictly in-place. For our major release cadence, we have found over the past two releases that significant physical schema changes drove huge ROIs for our customers; however, it is possible to reuse your hardware (and by implication forgo the ROIs). More importantly, you can upgrade and largely reuse your existing hardware even for the major releases. This is especially doable in a virtualized setup where your CPU nodes are very fungible and you are backed by a SAN-based storage utility. It is basically a rolling upgrade. It has the benefit of always being reversible, with no user downtime, and should not require significant new hardware. Again, you will necessarily miss out on most of the economic benefits of the upgrades. But you do have the choice.
4) Great to see some Lotus fans on the site. If, as so many of your peers have already done, you are getting interested in a migration to Exchange, it really is pretty easy. This link is a great starting point: www.whymicrosoft.com/ibm
After a brief break from blogging, I was back in the studio recently recording more videos to answer questions about Exchange design decisions.
Now most of the time I talk to customers about new functionality that we have added to Exchange, and for the last few blog entries I have focused on some of these changes related to storage and archiving.
In this entry I wanted to cover another topic which gets asked about a lot – why did the Exchange team not include an in-place upgrade option in the product in recent versions? Is it that the Exchange team is filled with a bunch of lazy developers or are there valid reasons for doing this?
I was going to stop there and let you watch the video (which I encourage you to do!) but I thought I’d give you a sneak peek into what the answer is, so here are some of the major points:
The video covers a lot, including the online mailbox move feature. I'm interested in any questions or comments these answers generate. Let me know what you think.
This blog business has been turning out to be more fun than I expected - maybe I should be nicer and less sarcastic when the marketing team suggests these ideas.
This time, answering the questions from my archiving blog and video has led into some areas which are a bit more controversial… there are two key issues that we talk about in my latest video:
1) Tiered storage: is it a great way of reducing costs?
2) Stubbing approaches to archives: the world’s most elegant architecture or what?
Tiered storage is a conversation I have been having in a variety of different contexts with numerous storage experts for the past decade. In short, the idea that you can lower your costs by separating your data into tiers and putting each tier on a different type of hardware is only true under very restricted conditions. Given the mass storage hardware available for the past decade, however, those conditions simply do not occur for content that needs to be, in any meaningful way, 'online'. So, unless one of your tiers is a bunch of data stuffed onto tapes squirreled away in a vault offsite, of which you expect to access only a tiny fraction, it is pretty clear that the lowest cost solution is to put all of your data, hot and cold together, onto the biggest, cheapest drives you can find. End of story. I have been nibbling away at this theme in previous posts, but this is an attempt to make it as black and white as possible: tiered solutions for online data will increase your costs. They will make you less green, cost you more $$, make you less efficient, and reduce the productivity of your users.
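A toy cost comparison shows the structure of the argument. All the prices and the fixed overhead below are invented round numbers for illustration, not quotes from any vendor; what matters is that the fast tier's premium plus the cost of operating the tiering machinery itself swamps whatever you save on the cold data.

```python
# Toy model: big cheap drives for everything vs. a two-tier design.
# All numbers are invented for illustration.

def single_tier_cost(total_tb, cheap_per_tb=100):
    """Everything, hot and cold, on the biggest cheapest drives."""
    return total_tb * cheap_per_tb

def tiered_cost(hot_tb, cold_tb, fast_per_tb=600, cheap_per_tb=100,
                overhead=20_000):
    """Fast tier for hot data, cheap tier for cold data, plus the fixed
    cost of running and administering the tiering machinery itself."""
    return hot_tb * fast_per_tb + cold_tb * cheap_per_tb + overhead

print(single_tier_cost(100))   # 10000 for 100 TB, everything together
print(tiered_cost(20, 80))     # 40000 - the promised 'savings' never appear
```

Change the assumed prices however you like; as long as the fast tier carries a meaningful premium and the tiering system itself is not free, the single cheap tier wins for online data.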
The other issue raised in response to the archiving video was related to stubbing: moving the bulk of the message out of Exchange and leaving only a link or ‘stub’ behind. I consider the stubbing approach to be one of those kludges that occur regularly in the software industry out of necessity. In Exchange 2003 and earlier, to avoid the tyranny of tape backup solutions for an online Exchange store, stubs were about the only thing many customers could do. Even so, only a tiny fraction of customers ever deployed solutions based on this approach. Those that did were not very happy but realistically didn't have any alternatives. Now that there are real alternatives, the complexity, fragility, incompleteness, and expense of stub solutions should make anyone thinking about deploying them pause and think for a very long time.
The last blog post generated some great questions, and I wanted to address some of them in a little more depth and continue the archiving discussion.
“We are all for removing pst files from the network; however, the major challenge that we have is handling the use case where users are offline (i.e. not connected and require access to the archive ….)“
“is it possible to include and/or exclude the archive mailbox in the offline file set (.OST)?”
These are great questions because they get at something that comes up a lot when I am talking to customers: "Is there a difference between the 'archive' and a large mailbox?" The question is important because the answer is at the core of the design decisions made in implementing this feature. The basic idea is that a user moves content from their primary folders to folders in their archive because they rarely want to see this data and would prefer it off to the side. This could be just a clutter question, or it could be about the amount of data that sits on their client machine. Fundamentally, if a user has .PST data that they consider part of their active working set, data they always want to be able to refer to, it probably makes sense to keep it in the main set of folders. Data there will be available offline through the .OST. Only when the data gets old enough that it is no longer worth keeping on the client, whether due to space constraints or for risk mitigation, should they move it to the archive.
Another way of putting it is that the ability to put content under the archive node is a way to manage the usability of the overall mailbox not to manage costs. To be precise, all the benefits of the low-cost storage design in Exchange 2010 accrue equally to all the content in Exchange. The ability to discover and retain data applies equally to all content. Where exactly the data should reside is about optimizing the productivity of the end-user.
For me, personally, I keep the most recent two years of email in my primary set of folders and then content older than that migrates to the archive to keep things from getting too cluttered. So I have mirrors of folders like inbox and sent items that contain older data and then I have project oriented folders that I move when they are no longer of immediate interest.
So far we have spent some time talking about how Exchange has been engineered to take advantage of large commodity storage while keeping the data protected and the overall system highly available in order to deliver large, low cost mailboxes.
This time I wanted to spend some time talking about how to take those abilities out for a spin and get some concrete benefits. Specifically, taking advantage of the extra storage to deploy an integrated archive at almost no additional cost and replace expensive add-on solutions.
The term archive covers a lot of different scenarios that people have deployed for email: extending people's mailboxes to replace .pst files and keep reference data around (reducing the work of recreating thoughts and ideas that have been worked through before); ensuring that data that needs to be kept for compliance or policy reasons is actually retained; and making sure that discovery operations are reliable and cost-effective. The great thing about an integrated approach is that not only is the cost of any one of these scenarios lower than the alternatives, but once you have deployed the archive for discovery and policy retention reasons, users get the direct benefit of an extended mailbox for free.
Ensuring that Exchange 2010 delivers across the complete set of scenarios that people think about when they design archiving solutions did mean working through a suite of features. A lot of the discussion in this video is about how the different use cases are covered by the functionality in Exchange, but I also talk about the future of Exchange Archiving…there is always more work which can be done.
I'm interested in any questions or comments you have about Storage and Archiving.
So far there have been some great comments and questions in response to the first couple of blogs. In the last written entry I responded to one of the comments. In the new video I recorded with Ann Vu, she gives voice to a few of the other great questions that have come through. The core of these questions was basically: Is replication really a complete solution to all data protection and retention needs? The answer is of course no.
Most of the conversation is about the suite of investments that we have brought together to guarantee that customers' data is fully protected against hardware and infrastructure failures, human error, and logical corruption.
There is also a follow-on question that can be interpreted as: Really? SQL still uses backup! I wanted to extend that discussion a little and stress a couple of things.
1) There are many applications built on SQL that use replication and log shipping from SQL as the core of their data protection strategy just like Exchange does.
2) At the core I think these questions are about developing trust. So one obvious test is the degree to which we are “eating our own Dogfood” to use an Exchange colloquialism. The fact that Microsoft's entire corporate deployment is based on a replication strategy is a strong answer – in that case we are risking the productivity of over 90,000 employees plus our contractors across the company. One of the easiest decisions was whether we should use backups as our data protection strategy. I hope it is pretty clear what we did there.
This time I wanted to respond to one of the comments from my last blog post: " ... apart complementing Perry for a great presentation .. a common mistake done by many companies is mixing two concepts, such as Data Protection and Service Continuity..."
It is a great comment (and not just because he has such great taste in presentations). The interesting question is whether Exchange has been guilty of muddling some of these concepts in our thinking around storage. Depending on the context of the charge we either plead enthusiastically GUILTY! or huffily NOT!
What do I mean by that? I think it is very important for everyone in this space to think through scenario requirements quite rigorously. Without a clean understanding of requirements and their strict prioritization, it is unlikely that a particular deployment will actually meet the needs. The biggest enemy of creating clean, prioritized requirements is fuzzy thinking about needs, such as: confusing service availability with data loss; conflating smoking-hole datacenter implosions with planned datacenter downtime; listing backups when the actual requirement is retention. In this context I think the Exchange team has done a pretty good job of teasing out and separating the underlying requirements that customers have -- very often by reverse engineering why customers built the many interesting deployments they have created.
The other context is implementation. In order to take cost and complexity out of a system, it is important to look at the requirements and find commonalities that can be solved with a single elegant architecture. Sometimes it is much better to implement solutions in refined and elegant but stand-alone ways: the automatic egg-cooking, coffee-making toaster does have some common elements, but should largely be viewed as a Rube Goldberg monstrosity. However, there is also the expensive monstrosity that comes from continuously bolting together one-off pieces to solve many different problems rather than stepping back and implementing a comprehensive solution with fewer moving parts. In this context we have been explicitly guilty of muddying the boundaries between solutions for the large set of requirements that exist in the storage space.
The central architectural concept that has been our leading light over the past two releases is the concept of Exchange Mailbox Storage Bricks. The core idea is borrowed from Jim Gray ( http://www.usenix.org/publications/library/proceedings/fast02/gray/index.htm and http://research.microsoft.com/apps/pubs/default.aspx?id=64151 ) -- essentially, that a simple scale-out model based on a basic storage+compute building block, which provides linear scaling characteristics and relies on shared-nothing clustering, is the best approach for cost-effectively deploying storage for email servers.
There are many benefits of this model, but the one I am going to stress here is the win from the shared-nothing aspect, which you only get by combining service continuity and data protection. In the shared storage model there is a clean separation between server failover and data protection, but it ends up being a very complicated system once you think about the storage network necessary to maintain the multiple links between each compute node and each piece of storage. Plus, a huge amount of complexity needs to be built into the system to guarantee that never, ever, ever do we allow more than one compute node to write to the same unit of storage. In the shared-nothing model, compute nodes are tied to their storage directly and exclusively. All that complexity around the fabric goes away, and you are architecturally guaranteed never to have more than one writer to a database.
There is one further win from the brick model that is in many ways the most important, and it is about the nature of redundancy. If you have FULLY independent systems, the availability of the overall system can be calculated very simply: you just concatenate 9's. For example, the availability of at least one of three fully independent systems, each with a relatively moderate availability of 99%, is 99.9999%. In most real-world cases things are not fully independent, so your mileage will vary dramatically -- for dependent chains your availability is no better than that of the weakest link. The challenge is to find ways to make things truly independent. The brilliance of the brick model is that you end up with individual nodes that are much more independent, and those nodes, while not particularly reliable themselves, can be combined into extremely reliable systems. In the shared storage world you simply do not get this independence benefit.
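The concatenating-9's math above can be sketched in a few lines. This is a simplified model that assumes perfect independence between copies, which, as noted, real systems only approximate:

```python
def system_availability(node_availability: float, copies: int) -> float:
    """Availability of a system that is up as long as at least one of
    `copies` fully independent nodes is up. With independence, the
    system is down only when every copy is down simultaneously."""
    unavailability = (1.0 - node_availability) ** copies
    return 1.0 - unavailability

# Three independent copies, each only 99% available:
print(round(system_availability(0.99, 3), 6))  # -> 0.999999 (six nines)

# One copy is, of course, just its own availability:
print(round(system_availability(0.99, 1), 6))  # -> 0.99
```

Each extra independent copy multiplies the downtime probability by the single-node downtime, which is why two more "moderate" nodes turn two nines into six.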
I'll be posting another video next week to answer some of the other questions from the data protection blog post - stay tuned!
Well, dealing with the creative responses of my co-workers after the first video was so much fun -- comments like "The man may excel at computational data storage and analysis, but he needs to learn a thing or two about whiteboard real estate." So, why stop? The answer is: we already taped the second one, and the marketing team really doesn't understand the concept of sunk costs.
In the new video we spend some time talking about benefits of the replication based storage model over backups for protecting your data.
Basically the issue comes down to a couple of key benefits of any replication based strategy:
1) Since you are replicating continuously, you don't have a discontinuous process, intrusive to regular operation, that needs to run on a schedule. More importantly, with replication you only need to copy each piece of content ONCE, whereas with a backup model you have to copy each piece every time you do a full backup. This means the cost and complexity of backup become unsupportable once mailboxes get very large. Replication copies each mail only once, so it is continuous and its cost and complexity are proportional to delivery rates, NOT to how long each mail is kept.
2) Because the copies are fully up-to-date all the time and are true, validated replicas, the time it takes to restore (i.e. get the backup up and running) is much faster in a replication model.
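The arithmetic behind point 1 can be sketched with some hypothetical numbers (the mailbox size, growth rate, and weekly-full-backup schedule are made up for illustration, not a sizing recommendation):

```python
def full_backup_bytes(mailbox_gb: float, growth_gb_per_week: float, weeks: int) -> float:
    """Total data copied by weekly full backups: the entire (growing)
    mailbox is re-copied every single week."""
    total = 0.0
    size = mailbox_gb
    for _ in range(weeks):
        size += growth_gb_per_week
        total += size  # a full backup copies the whole mailbox again
    return total

def replication_bytes(mailbox_gb: float, growth_gb_per_week: float, weeks: int) -> float:
    """Continuous replication copies each byte once: the initial seed
    plus whatever is newly delivered."""
    return mailbox_gb + growth_gb_per_week * weeks

# A 10 GB mailbox growing 0.5 GB/week, over one year:
print(full_backup_bytes(10, 0.5, 52))   # -> 1209.0 GB copied by full backups
print(replication_bytes(10, 0.5, 52))   # -> 36.0 GB copied by replication
```

The backup total scales with mailbox size times retention time, while the replication total scales only with delivery rate, which is exactly why big mailboxes break the backup model first.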
There are more benefits of replication that aren't covered here -- things like being much more resilient to logical corruption in the storage stack, because the write paths are so different on the primary and the replicas, and the ability to spread the replicas around a continent and get a true disaster recovery benefit. But we can go into more detail on those issues another time.
At this point, I suppose some might be wondering if classic backups are good for anything. Well some people might very well think that. I couldn't possibly comment. At least right now... But I'd like to hear what you think!
There were some good questions raised from my first blog post which I'll answer here.
First, yes, Exchange 2010 should seem a little snappier on similar hardware compared to Exchange 2007, and especially compared to 2003. But the snappiness should be only a small side effect of the overall IO efficiency improvements, assuming the hardware was properly sized for the original version. Overall we have seen an improvement of 2-4x in the number of IOs necessary to support a given user profile between 2007 and 2010. To get the full benefits of the new version you do need to design your hardware deployment properly, following both our guidance and that of your storage vendor.
Ok, so how did we do it? Getting such a large change in the performance characteristics of the system required a lot of changes. However, the conceptual core of the changes -- the 'theme', as it were -- was finding ways to take small random IOs and make them bigger and more sequential. Disk drives are now so dense that the time it takes for the head to read 64kB off the platter, compared to reading 4kB off the platter, is small relative to the time it takes to move to the right track and then wait for the data to rotate under the head (this is especially true if you are using green drives, which use much less power because they rotate more slowly). Since that is the case, if you can combine sixteen 4kB IOs into one big 64kB IO, you get close to a 16x IO improvement. The biggest changes we made to get this win were to:
We also got some of our wins by improving our cache efficiency; there were a couple of wins there:
This webcast goes into more detail about the Exchange 2010 storage changes -- https://msevents.microsoft.com/CUI/WebCastEventDetails.aspx?EventID=1032418921
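The seek-versus-transfer argument behind the 16x claim can be made concrete with a back-of-the-envelope sketch. The drive parameters below are illustrative assumptions for a generic 7200 rpm disk, not measurements of any particular hardware:

```python
# Illustrative mechanical-disk parameters (assumptions, not measurements)
SEEK_MS = 8.0            # average seek time to the right track
ROTATION_MS = 4.17       # average rotational latency (half a turn at 7200 rpm)
TRANSFER_MB_PER_S = 100  # sustained sequential transfer rate

def io_time_ms(size_kb: float) -> float:
    """Time for one random IO: positioning cost plus transfer time."""
    transfer_ms = size_kb / 1024 / TRANSFER_MB_PER_S * 1000
    return SEEK_MS + ROTATION_MS + transfer_ms

small = 16 * io_time_ms(4)   # sixteen separate random 4kB IOs
large = io_time_ms(64)       # one combined 64kB IO
print(small / large)         # ~15x -- close to the 16x upper bound
```

The positioning cost (~12ms here) dwarfs the extra transfer time for the bigger read (~0.6ms), so almost all of the sixteen-fold positioning overhead disappears when the IOs are coalesced.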
Now, what about virtual storage?
I have been accused of being a bit of an anti-virtualization bigot. But the truth is I am a huge fan, and I have seen the potential benefits first hand from the time I spent working in Microsoft's IT department. There are many LOB applications in most companies that consume relatively small chunks of storage and CPU. However, in a dedicated model there is a minimum practical amount of storage that can be deployed per application if it is on its own hardware. So our own IT group used to have thousands of applications with storage utilization rates of less than 20%. By creating a central storage utility that is shared across many applications (the disk drives that the servers connect to are 'virtual'), it is possible to get much higher average utilization rates while still providing room for spikes in load and growth. The cost savings can be dramatic.
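The consolidation win can be sketched with illustrative numbers. The app count, minimum allocation, actual usage, and headroom below are assumptions chosen to show the shape of the math, not figures from Microsoft IT:

```python
def dedicated_provisioned_tb(apps: int, min_alloc_gb: float) -> float:
    """Dedicated model: every app gets at least the minimum practical
    storage allocation, used or not."""
    return apps * min_alloc_gb / 1024

def pooled_provisioned_tb(apps: int, used_gb_per_app: float, headroom: float) -> float:
    """Shared utility model: the pool is sized for actual usage plus
    headroom for spikes and growth."""
    return apps * used_gb_per_app * (1 + headroom) / 1024

# 1000 small apps, 500 GB minimum allocation each, ~100 GB actually used:
print(dedicated_provisioned_tb(1000, 500))       # ~488 TB provisioned, ~20% utilized
print(pooled_provisioned_tb(1000, 100, 0.30))    # ~127 TB provisioned, with 30% headroom
```

Under these assumptions the shared utility needs roughly a quarter of the raw capacity, which is where the dramatic savings come from -- and also why the win evaporates when, as with Exchange, utilization is already high.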
However, with most Exchange deployments the amount of data involved is so high that there usually isn't a problem getting very good utilization, and the SANs are often used in a very dedicated model for the Exchange deployment anyway. Without great capacity-utilization wins it is difficult to overcome the large per-spindle and per-bit cost overheads associated with these approaches, and the complexity of the systems can be significant.
Typically with a SAN deployment you would design using a RAID-1 configuration for your primary system, some sort of backup to disk using snaps, then an offload to tape, and a redundant site if you were concerned about geo-scaling. In the JBOD approach, the model is to map a single drive to a single database. Then you choose the number of replicas you want of each database and the number of physical locations you want across those copies. When you lose a spindle, load is transferred to another spindle on another system. In addition to the reduced complexity of not having a shared storage fabric, you get added availability benefits because the full hardware stack is protected by application-level replication.
This webcast goes into more detail about the Exchange 2010 High Availability changes -- https://msevents.microsoft.com/CUI/WebCastEventDetails.aspx?EventID=1032416677
Well, I guess I really am joining the blogging bandwagon. Although, I suppose blogging has been sufficiently eclipsed by Twitter that it can't really be thought of as a bandwagon anymore. At least not a cool bandwagon. The preceding is by way of a welcome to my blog.
What's the blog about?
It is not a forum for Exchange announcements, nor a place where you can get the latest Exchange rollup, what is in it, or the dates for the next release. My answers would probably be wrong, and there is a perfectly good source for stuff like that: www.msexchangeteam.com
No really, what's it about?
Some enthusiastic people in marketing think that some of the conversations that I have with customers and partners about the Exchange perspective on technology trends in our industry and how those trends shape our key design decisions (especially around storage technology) might be valuable to other humans. I have resisted that input for years because talking to customers one-on-one is fun and writing stuff down in a blog sounds like work. They promise me it really will be fun and I don't need to worry about being overly edited, so here goes.
That still doesn't answer the question does it? Well, I don't really know yet. Certainly there are some questions that I think it would be great to cover directly or indirectly such as: Does Exchange really hate SANs? Can Enterprises really live without a backup? Isn't tiered storage the greatest thing since Saran Wrap? Don't you need Enterprise disks to run your Enterprise on? I hope the topics that get initiated here are just the jumping off point for more interesting conversations. It is the back and forth that makes talking to customers directly so interesting.
Who am I?
I'm Perry Clarke -- currently, I run the Exchange Mailbox Server Engineering team. That has been my role through the Exchange 2007 and 2010 releases, and we made a lot of interesting bets through those releases that we can talk about. Now, I finally have time for things like this.
To get on with it: I recently spent some time chatting to Ann Vu about disk capacity and I/O efficiency and the marketing team turned it into a video (the first of a video series).
I'm interested in any questions or comments the answers spark - what's your view on storage trends and how we've architected Exchange?