Even though some very old KBs, which now refer to unsupported products, state that taking databases offline to run periodic offline defragmentation with ESEUTIL is not recommended, some folks in the field still want to do this.
Previously, when there was only a single copy of a database, running offline defragmentation would cause minimal impact apart from the time required for the defragmentation itself, which could be several hours or longer depending on database size and disk throughput. This changes when we consider having multiple copies of a database in a Database Availability Group (DAG).
So you may be wondering how best to defragment Exchange 2010 databases that are in a DAG as people often look at the white space in a database and seek to immediately reclaim it.
In short, this is not a good idea for a couple of reasons:
Please note that we are discussing offline defragmentation via ESEUTIL /D, and not the online maintenance routines that now run 24x7 in newer versions of Exchange, or that ran in online maintenance windows in previous versions.
What happens when an Exchange database is defragmented using ESEUTIL /D? The defragmentation process will copy out valid pages of ESE data from the old database file to a new database. This process leaves white space behind as it does not contain data. You will note that I specifically said new database. This has a different GUID than the original database. Creating a database with the same name, but different GUID, means that Exchange sees them as different databases not as multiple copies of the same database.
This will result in errors like the following since the databases are not copies of one another. Errors that may be seen include, but are not limited to:
Let’s look at an example of the impact caused by running offline defrag against a database that is replicated in a DAG.
We shall defragment the database DB01. Our starting configuration has two copies of this database and all is currently running well.
So let’s dismount DB01, and then validate that the two mailbox servers have the same GUID for DB01. We are using ESEUTIL /MH to dump out the header from the database.
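As a rough sketch, the dismount and header dump might look like the following. The path D:\DB01\DB01.edb is a hypothetical placeholder and not taken from the environment above, so substitute the real EDB path on each mailbox server. Filtering the output with Select-String is only a convenience to show the signature lines.

```powershell
# Dismount the database before reading its header.
Dismount-Database -Identity DB01 -Confirm:$false

# Dump the database header and keep only the signature lines.
# D:\DB01\DB01.edb is a hypothetical path - use your own EDB file path.
eseutil /mh "D:\DB01\DB01.edb" | Select-String -Pattern 'Signature'
```

Run the ESEUTIL line on each server that holds a copy and compare the Rand value on the DB Signature line.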
On the first mailbox server we see a Rand of 2733649. The GUID is displayed in the ‘DB Signature’ line and is the ‘Rand’ value. Be sure to look at the correct signature, as there is a signature for both logs and databases. It is expected that the Rand in these two lines will be different.
On the second mailbox server we see the same Rand of 2733649. You can see the server name in the title bar of the PowerShell window.
We have shown that the same database is present on both servers, i.e. both copies have the same Rand of 2733649.
Let’s now defragment DB01 on the first server, then see what happens...
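For reference, an offline defragmentation run might be invoked along these lines. Both paths are hypothetical placeholders. The optional /t switch names the temporary database that ESEUTIL builds, which is useful when the database volume does not have enough free space to hold a second copy.

```powershell
# Offline defragmentation of the dismounted database.
# Both paths below are hypothetical - substitute your own.
# /t sets the location of the temporary database built during the defrag.
eseutil /d "D:\DB01\DB01.edb" /tE:\Temp\DB01-defrag.edb
```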
Then let’s check the Rand to see if the old value of 2733649 is still present:
Nope, it’s not. The Rand is now 143007541. Same name, but this is a different database.
Trying to activate the database copy on another server will create a sea of red in the application event log. You will receive the errors listed above, and the most descriptive is Event ID 4807:
At this point since the databases are no longer copies of one another we will have to re-seed the copy of the database. Depending upon database size, disk throughput and network capacity this can take an extended period of time. Let’s use PowerShell to re-seed the database copy:
Update-MailboxDatabaseCopy -DeleteExistingFiles -Identity DB01\Consea-MB2
This will have to be repeated for every other copy of the database in question. If there are multiple copies over a WAN link then it would be a good idea to manually specify the seeding source using the -SourceServer switch. That way one copy can be seeded over the WAN, and other copies can then use that as a local source, thereby minimising WAN traffic and reducing seeding time.
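A minimal sketch of seeding from a nearby copy rather than from the active one is shown below. It assumes that Consea-MB2 has already been reseeded and that Consea-MB3 is a further DAG member in the same site. Consea-MB3 is a hypothetical server name used purely for illustration.

```powershell
# Consea-MB3 is a hypothetical DAG member in the same site as Consea-MB2.
# A copy must be suspended before it can be reseeded; skip this line if
# the copy is already in a suspended state.
Suspend-MailboxDatabaseCopy -Identity DB01\Consea-MB3 -Confirm:$false

# Seed from the local, already healthy copy instead of crossing the WAN.
Update-MailboxDatabaseCopy -Identity DB01\Consea-MB3 -SourceServer Consea-MB2 -DeleteExistingFiles
```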
Note that there are multiple options worth checking out with Update-MailboxDatabaseCopy. They include options to explicitly choose a network, encryption and compression. Chances are if you used Exchange 2010 RTM then you are quite adroit at using the -CatalogOnly switch!
When the seeding task completes, we can check that the database copies are OK.
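One way to run that check from the Exchange Management Shell is sketched below. The choice of columns is just a suggestion.

```powershell
# List every copy of DB01 with its health and queue lengths.
Get-MailboxDatabaseCopyStatus -Identity DB01 |
    Format-Table Name, Status, CopyQueueLength, ReplayQueueLength, ContentIndexState -AutoSize
```

A healthy configuration would typically show one copy as Mounted and the others as Healthy, with low copy and replay queue lengths.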
Checking the Rand on the updated copy of the database, we can see that it has been updated and now has the same Rand which was generated by the defrag, 143007541.
Having to take a database offline for hours to defragment, and then manually reseeding all of its database copies is pretty painful. Is there a better way to do this?
There certainly is!
Since Exchange 2010 introduced the online mailbox move feature, it is now pretty seamless to perform mailbox moves to a new mailbox database and when the old database is empty, simply delete it! This process can be made even better with use of the SuspendWhenReadyToComplete parameter. As an example:
New-MoveRequest -Identity 'User-21' -TargetDatabase DB01 -SuspendWhenReadyToComplete
This copies the vast majority of the mailbox content and then pauses. The administrator then manually resumes the move request using Resume-MoveRequest. This means we can copy mailbox content through the day with no user impact, and after hours the suspended move can be rapidly completed. This has to be one of my favourite Exchange 2010/2013 features!
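To evacuate a whole database this way, the staging and completion steps could be sketched as follows. OldDB and NewDB are hypothetical database names. Before deleting the old database, also check for any arbitration mailboxes it may hold, since these are not returned by a plain Get-Mailbox.

```powershell
# During the day: stage every mailbox in the old database for a move.
# OldDB and NewDB are hypothetical names - substitute your own.
Get-Mailbox -Database OldDB -ResultSize Unlimited |
    New-MoveRequest -TargetDatabase NewDB -SuspendWhenReadyToComplete

# After hours: review the moves that are waiting, then complete them.
Get-MoveRequest -MoveStatus AutoSuspended | Get-MoveRequestStatistics |
    Format-Table DisplayName, Status, PercentComplete -AutoSize
Get-MoveRequest -MoveStatus AutoSuspended | Resume-MoveRequest
```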
The same logic can also be applied to a mailbox database that must be evacuated for other reasons. This may be necessary if file system AV has scanned the database as it will be in an unknown and thus unsupported state.
Note that the Mailbox Replication Service (MRS) is throttled, and if you wish to apply a little accelerando to the move process then you will need to take a look at the throttling configuration.
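As a starting point, the current limits can be inspected without changing anything. This sketch assumes Exchange 2010, where the MRS throttling values live in MSExchangeMailboxReplication.exe.config on the Client Access server, and assumes the ExchangeInstallPath environment variable is present on that server. Later versions handle throttling differently, so treat this as illustrative only.

```powershell
# Read-only look at the move limits in the MRS configuration file.
# Assumes Exchange 2010 and that ExchangeInstallPath is set on this server.
$mrsConfig = Join-Path $env:ExchangeInstallPath 'Bin\MSExchangeMailboxReplication.exe.config'
Select-String -Path $mrsConfig -Pattern 'MaxActiveMoves|MaxTotalMoves'
```

Any change to these values requires a restart of the Mailbox Replication service to take effect, and should be tested carefully since the defaults exist to protect the servers from overload.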
Cheers,
Rhoderick
Rhod, great article as always. In a follow-up article I would love to see comments as to how long a db defrag can take and why trying to reclaim white space is often a fool's errand. Cheers and take care...sc
Howdy Sean!
Very true, I can't recall when I last did an offline defrag to reclaim space. Last time would have been many, many moons ago!
Say hello to Mr Thiessen for me please :)
Got a small query here regarding seeding...
Consider 2 (P) + 1 (DR) copies, with an average DB size of 500GB.
Copy/replay queue status seems normal, i.e. 0. Suddenly something goes wrong, such as a missing log file, and we end up with a disk space issue (P).
Action plan - you know you can neither activate the copy nor truncate the logs.
Dismounted the DB (P, active), moved the logs because it was healthy, and mounted it.
Q - having the passive copy node's *.edb file (500GB), can we use it to update the DB incrementally instead of doing a complete DB seed between the nodes in the P and DR sites?
I tried the -SeedingPostponed / -Force switches but no luck - does this really work, or is there no other option but a complete reseed?
Hi Charles,
Missed this over the long holiday weekend.
Did you do what I have above and dump out the database headers with ESEUTIL /MH?
What I expect you to find is that on the active instance, the database is updated to use the new log stream but on the passive that copy of the database file is still looking for the original log stream GUID. Since it does not match the passive DB will not attach to the log stream.
Does that match your observed behaviour?
So you are right about the new log stream, as it will not attach, but is there any way we could use the same old passive DB for incremental seeding instead of asking the system to delete the existing files and seed?
Don't think so but, let me have a think about that please. Things are a bit hectic at the mo, and I'll reply when I get a chance.
Hmm, I understand. I have not been able to figure out any workaround, but I hope something comes up in future, as seeding via WAN to DR wouldn't be feasible for DB sizes from 500GB to 2TB.