Exchange 2007 Service Pack 1 introduces several changes in the Extensible Storage Engine (ESE). Two of these are the removal of page dependencies and the disabling of partial merges.
But first, to understand what is discussed below, we need a brief discussion of ESE.
ESE Architecture
As you are well aware, operations that occur within Exchange are written to the transaction logs, changes are made to the database pages in memory, and then eventually those changes are flushed to the database file.
Thus, ESE utilizes a transactional architecture; specifically, ESE follows the ACID methodology. ACID is an acronym that stands for Atomicity, Consistency, Isolation, and Durability.
The Extensible Storage Engine utilizes several components:
Note: For more information on B+ trees, please see http://en.wikipedia.org/wiki/B%2B_tree.
So how do the above components work together?
For more information, please see Understanding the Exchange 2007 Store at http://technet.microsoft.com/en-us/library/bb331958.aspx.
Page Dependencies Removal
As an ESE database is updated, B+ tree splits and merges mean that records have to be moved from one location in the database to another. The easiest way to log the record move is to log a deletion ("record A was removed from page 13") and an insertion ("record A was inserted into page 14"). The disadvantage of this approach is that the amount of data logged is proportional to the size of the records being moved. Because the original insertion of the record ("record A was inserted into page 13") was already logged, it is redundant to log the same data again.
To avoid logging multiple copies of the data, ESE uses page dependencies, which reduce the amount of data logged during B+ tree splits or merges by forcing pages to be written to disk in a specific order. Instead of logging a deletion and an insertion, we want to log a move operation ("record A was moved from page 13 to page 14").
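To make the trade-off concrete, here is a rough sketch comparing how much data each logging strategy writes. The 16-byte per-log-record overhead and the record sizes are made-up numbers for illustration only; they are not the real ESE log record layout.

```python
# Hypothetical fixed overhead per log record (page number, record key, etc.).
# Illustrative value only; not the real ESE log record format.
LOG_RECORD_OVERHEAD = 16


def bytes_logged_delete_insert(record_sizes):
    """Delete + insert: the insert has to carry the record data again."""
    deletes = len(record_sizes) * LOG_RECORD_OVERHEAD
    inserts = sum(LOG_RECORD_OVERHEAD + size for size in record_sizes)
    return deletes + inserts


def bytes_logged_move(record_sizes):
    """Move: only the source and destination locations are logged."""
    return len(record_sizes) * LOG_RECORD_OVERHEAD


records = [500, 1200, 3000]  # record sizes in bytes (made up)
print(bytes_logged_delete_insert(records))  # 4796
print(bytes_logged_move(records))           # 48
```

The first figure grows with the size of the records, while the second grows only with their count, which is the saving that a move operation is after.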
Suppose there is a crash after record A is moved from page 13 to page 14. If pages 13 and 14 can be flushed to disk independently then there are four possible states for pages 13 and 14 in the database:
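| Page 13 | Page 14 | Recovery Action |
|---------|---------|-----------------|
| A | <blank> | Move A from 13 to 14 |
| <blank> | A | Nothing - A has already been moved |
| A | A | Delete A from page 13 |
| <blank> | <blank> | Disaster - record A has been lost! |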
In the last case, page 13 was written to disk after record A was removed, but page 14 (which now contains record A) was not written to disk. To avoid this, a page dependency is created between pages 13 and 14 so that page 14 must be written to disk before page 13. This means that after a crash only the first three cases above are possible.
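The four crash states, and the effect of the dependency, can be sketched in a few lines. This is a toy model in which each page is either flushed (its post-move image is on disk) or not; the function names are hypothetical and nothing here reflects actual ESE code.

```python
def recovery_action(a_on_page_13, a_on_page_14):
    """Return what recovery must do when replaying 'move A from 13 to 14'."""
    if a_on_page_13 and not a_on_page_14:
        return "redo move"         # neither page was flushed
    if not a_on_page_13 and a_on_page_14:
        return "nothing"           # both pages were flushed
    if a_on_page_13 and a_on_page_14:
        return "delete A from 13"  # only page 14 was flushed
    return "record lost"           # only page 13 was flushed


def reachable_outcomes(enforce_dependency):
    """List the recovery outcomes for every on-disk state a crash can leave.

    With the dependency enforced, page 13 may only be flushed after page 14.
    """
    outcomes = []
    for flushed_13 in (False, True):
        for flushed_14 in (False, True):
            if enforce_dependency and flushed_13 and not flushed_14:
                continue  # forbidden write ordering
            a_on_13 = not flushed_13  # pre-move image of page 13 holds A
            a_on_14 = flushed_14      # post-move image of page 14 holds A
            outcomes.append(recovery_action(a_on_13, a_on_14))
    return outcomes


print(reachable_outcomes(enforce_dependency=False))
print(reachable_outcomes(enforce_dependency=True))
```

Without the dependency, "record lost" appears among the outcomes; with it, only the three recoverable cases remain.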
What are the benefits to this approach? Page dependencies are a cunning way to reduce the amount of data logged when records are moved between pages in the database. If we only have to log the record's location when moving it, instead of the actual record itself, then we are reducing log generation, which ultimately reduces capacity requirements.
But this poses several problems:
So how can we address this situation? The simple answer is to remove page dependencies. Instead of creating page dependencies, we can simply log the entire source page when performing a record move or split. At recovery time, if the destination page has not been flushed (i.e., it doesn't contain the data) but the source page has been flushed (i.e., it doesn't contain the data either), the logged page image can be used to redo the data move.
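As a minimal sketch of that recovery path, assume a log record that carries the pre-move image of the source page. The dictionary layout and the `redo_move` name are hypothetical and chosen for illustration; they are not the real ESE log format.

```python
def redo_move(disk, log_record):
    """Replay a logged move using the source page image stored in the log.

    `disk` maps page number -> {record key: record data} as found after a crash.
    """
    src, dst, key = log_record["src"], log_record["dst"], log_record["key"]
    if key not in disk[dst]:
        # Destination was never flushed. The data is always available from
        # the logged image, whether or not the source page was flushed.
        disk[dst][key] = log_record["src_image"][key]
    disk[src].pop(key, None)  # the source must not hold the record afterwards
    return disk


log_record = {"src": 13, "dst": 14, "key": "A", "src_image": {"A": b"payload"}}

# Former "disaster" case: page 13 was flushed, page 14 was not.
disk = {13: {}, 14: {}}
print(redo_move(disk, log_record))  # {13: {}, 14: {'A': b'payload'}}
```

Because the log record is self-sufficient, all four on-disk states converge to the same result, and the two pages can be flushed in any order.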
So what effect does removing page dependencies in Exchange 2007 SP1 have?
However, removing page dependencies does have an impact. Page dependencies were originally conceived as a log optimization technique. By removing them, we now have to log the data being moved, which means that log generation increases. Internally, we saw a 33% increase in log generation after we disabled page dependencies. The increase in log generation affects other things as well:
Right now, many of you may be thinking "Yikes, SP1 is going to pwn my log drives and since I didn't account for a 33% increase... darn you Exchange!" Relax; we have you covered. Here's how we addressed this situation.
Disabling Partial Merges
While removing page dependencies provides us many benefits, the increase in log generation and the repercussions of this increase are not ideal. So we had to find a way to reduce log generation without breaking the ACID rules. To solve this issue, we went back and analyzed the logs generated on servers. After disabling page dependencies, we found that a significant portion of the increase in log generation was due to the automatic online defragmentation of the database.
Online defragmentation (OLD) is a process used to free up pages in the EDB file. This reduces the number of pages that have to be visited in order to locate data or insert it into the appropriate place. Essentially, the OLD process navigates to the end of a B+ tree and starts moving records from the leftmost pages to the rightmost pages, collapsing the B+ tree as much as possible. In many cases, the engine merges records from one page into another without actually freeing a page; this is known as a partial merge. The hope is that a partial merge will allow the page to be freed during the next OLD pass.
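The difference between a full merge and a partial merge can be sketched as follows. This is a toy model in which a page is a list of record sizes; the `merge_pages` function, the `allow_partial` flag, and the 8,000-byte usable capacity are assumptions made for illustration and do not reflect the actual OLD implementation.

```python
def merge_pages(source, destination, capacity, allow_partial):
    """Try to move records from `source` into `destination`.

    Pages are modelled as lists of record sizes in bytes; `capacity` is the
    usable space per page. Returns True if the source page was freed.
    """
    free_space = capacity - sum(destination)
    if sum(source) <= free_space:
        destination.extend(source)
        source.clear()
        return True  # full merge: the source page can be freed
    if allow_partial:
        # Partial merge: move whatever fits, but the source page stays in use.
        while source and source[0] <= free_space:
            size = source.pop(0)
            destination.append(size)
            free_space -= size
    return False


src, dst = [3000, 3000], [4000]
print(merge_pages(src, dst, capacity=8000, allow_partial=True))  # False
print(src, dst)  # [3000] [4000, 3000]
```

In this example both pages were modified (and so both changes must be logged), yet no page was freed, which is exactly the cost of a partial merge.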
Partial merges were useful back when disks were very small (think back to the Exchange 5.5 days), since it was important to make the database as dense as possible so that every last byte of storage was put to good use.
However, partial merges have consequences. As many of you have probably witnessed using Performance Monitor, OLD is an extremely disk write I/O intensive process. In addition, since we are moving data around within the database, the operations need to be logged, which makes OLD log generation intensive as well. The database churn that occurs during OLD also has another side effect that customers saw with the release of Exchange Server 2003: VSS snapshots are rather large because a significant portion of the database changes each time OLD executes.
So what would happen if we disabled partial merges? We tried it and found two things:
1. With partial merges disabled, databases are not compacted as tightly, because we only move records from one page to another if we can free up the entire source page. As a result, there is some bloat in the database; however, the bloat is small and does not increase drastically over time. For example, consider a server with a 162 GB store and a 171 GB store. We ran a stress test and analyzed the difference between having partial merges enabled and disabled. After four weeks with partial merges disabled, the database file had increased in size by only about 2%.
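| 4-week stress test | Partial Merges Enabled | Partial Merges Disabled |
|--------------------|------------------------|-------------------------|
| DB Size pre-defrag (bytes) | 162,588,409,856 | 171,188,961,280 |
| % Available Pages | 1.80% | 1.33% |
| DB Size post-defrag (bytes) | 158,247,026,688 | 163,652,911,104 |
| % Difference | 2.65% | 4.45% |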
2. With partial merges disabled, database churn and log generation decrease significantly when OLD runs. The following example compares two storage groups, one with partial merges enabled and the other with partial merges disabled. On the storage group with partial merges enabled, 30 GB of the database was manipulated and about 18,000 log files were generated, whereas on the storage group without partial merges, only 5.5 GB of the database changed and about 13,700 log files were generated. That's a reduction in database churn of ~80% and a reduction of ~25% in log generation.
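| 3000 Mailbox server (8 hours) | Partial Merges Enabled | Partial Merges Disabled |
|-------------------------------|------------------------|-------------------------|
| OLD Page Reads | 56,160,000 | 57,830,400 |
| OLD Pages Dirtied | 3,830,400 | 691,200 |
| OLD Pages Freed | 392,457 | 201,600 |
| BTree Partial Merges | 2,494,080 | 0 |
| Database Churn (GB) | 30 | 5.5 |
| Log Files Generated | 18,269 | 13,701 |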
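The churn and reduction figures can be sanity-checked with a little arithmetic. The sketch below assumes that database churn is roughly the number of OLD pages dirtied multiplied by the 8 KB page size that Exchange 2007 uses; the post does not say how the churn figures were measured, so treat that link as an assumption.

```python
PAGE_SIZE_BYTES = 8 * 1024  # Exchange 2007 database page size (8 KB)


def churn_gb(pages_dirtied):
    """Approximate churn in GB, assuming one full page per dirtied page."""
    return pages_dirtied * PAGE_SIZE_BYTES / 1e9


def reduction_pct(before, after):
    """Percentage reduction going from `before` to `after`."""
    return (before - after) / before * 100


print(round(churn_gb(3_830_400), 1))         # 31.4 (reported churn: 30 GB)
print(round(churn_gb(691_200), 1))           # 5.7  (reported churn: 5.5 GB)
print(round(reduction_pct(30, 5.5)))         # 82
print(round(reduction_pct(18_269, 13_701)))  # 25
```

The estimates land close to the reported 30 GB and 5.5 GB, and the two reductions match the ~80% and ~25% quoted above.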
In addition to OLD, we also found that partial merges were performed during normal runtime. Continuing with the 3000 mailbox server, we noticed an average of 3 B+ tree partial merges/sec over a 24-hour period after we disabled partial merges in OLD. Each partial merge equated to roughly 3 page touches (pages dirtied), which over a 24-hour period resulted in about 6,000 logs being generated (the server generates around 110 thousand logs a day). By removing partial merges during normal runtime, we saw an additional 5% reduction in log generation.
The benefit that partial merges provide in terms of database compactness is heavily outweighed by the cost of achieving that compactness (database churn and log generation). In the end, disabling partial merges netted us a 30% reduction in log generation and substantially reduced our database churn during OLD.
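The runtime numbers can be reproduced with a back-of-the-envelope calculation. The sketch assumes 8 KB database pages and 1 MB log files (both true of Exchange 2007) and, as a simplification, that each dirtied page costs about one page's worth of log; the function name is hypothetical.

```python
PAGE_SIZE_BYTES = 8 * 1024    # 8 KB database page
LOG_FILE_BYTES = 1024 * 1024  # 1 MB transaction log file
SECONDS_PER_DAY = 24 * 60 * 60


def logs_per_day_from_merges(merges_per_sec, pages_per_merge):
    """Estimate daily log files caused by runtime partial merges.

    Simplifying assumption: each dirtied page adds one page image to the log.
    """
    pages_dirtied = merges_per_sec * pages_per_merge * SECONDS_PER_DAY
    return pages_dirtied * PAGE_SIZE_BYTES / LOG_FILE_BYTES


logs = logs_per_day_from_merges(merges_per_sec=3, pages_per_merge=3)
print(round(logs))                      # 6075
print(round(logs / 110_000 * 100, 1))   # 5.5 (% of the server's daily logs)
```

That is in line with the roughly 6,000 logs and the additional 5% reduction reported for this server.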
Log Generation Numbers & Message Profiles
Even before we started coding SP1, we knew we were going to remove page dependencies. At the time, we knew log generation would grow, but we didn't know how we would curb it, so we assumed the worst case: that we would ship SP1 with increased log generation.
So back in January of 2007, we released the storage calculator (http://go.microsoft.com/fwlink/?linkid=84202) and updated the storage design article (http://technet.microsoft.com/en-us/library/bb738147.aspx). One of the guidance changes included this table which associated log generation with the message profile:
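| Mailbox profile | Message profile | Logs generated / mailbox / day |
|-----------------|-----------------|--------------------------------|
| Light | 5 sent/20 received | 7 |
| Average | 10 sent/40 received | 14 |
| Heavy | 20 sent/80 received | 28 |
| Very heavy | 30 sent/120 received | 42 |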
What we did not tell you at the time was that the values in the Logs generated / mailbox / day column included an increase for the page dependency removal. The good news is that, because we disabled partial merges, the log generation growth caused by removing page dependencies was canceled out. As a result, we are changing our log generation guidance to the following:
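| Mailbox profile | Message profile | Logs generated / mailbox / day |
|-----------------|-----------------|--------------------------------|
| Light | 5 sent/20 received | 6 |
| Average | 10 sent/40 received | 12 |
| Heavy | 20 sent/80 received | 24 |
| Very heavy | 30 sent/120 received | 36 |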
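For planning purposes, the revised values (6, 12, 24, and 36 logs per mailbox per day for the Light, Average, Heavy, and Very heavy profiles) translate into daily log volume as in the sketch below. It assumes 1 MB log files; the function name and the 3,000-mailbox example are made up for illustration, and the storage calculator remains the authoritative tool.

```python
# Revised guidance: logs generated per mailbox per day, by mailbox profile.
LOGS_PER_MAILBOX_PER_DAY = {
    "Light": 6,
    "Average": 12,
    "Heavy": 24,
    "Very heavy": 36,
}
LOG_FILE_MB = 1  # Exchange 2007 transaction log files are 1 MB each


def daily_log_volume(mailboxes, profile):
    """Return (log files per day, log volume per day in GB)."""
    logs = mailboxes * LOGS_PER_MAILBOX_PER_DAY[profile]
    return logs, logs * LOG_FILE_MB / 1024


print(daily_log_volume(3000, "Heavy"))  # (72000, 70.3125)
```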
Note: We will be updating the storage calculator and our storage guidance documentation on TechNet as a result.
Conclusion
To summarize, while removing page dependencies caused a 33% increase in log generation when compared with RTM, we were able to mitigate it by disabling partial merges. The end result is that we now have the following benefits:
- Ross Smith IV