250 Hello

Random Musings on Exchange and Virtualization

Offline Defrag And DAG Databases, Oh My!


Even though some of the very old KBs, which now refer to unsupported products, state that taking databases offline to run periodic offline defragmentation with ESEUTIL is not recommended, some folks in the field still want to do this.

Previously, when there was only a single copy of a database, the main impact of running offline defragmentation was the downtime while the process ran, which could be several hours or longer depending on database size and disk throughput. This changes when we consider having multiple copies of a database in a Database Availability Group (DAG).

So you may be wondering how best to defragment Exchange 2010 databases that are in a DAG as people often look at the white space in a database and seek to immediately reclaim it.

In short, this is not a good idea for several reasons:

  • Defragmenting DAG databases leads to more work, as every database copy must be re-seeded afterwards
  • Mailboxes are offline while the defragmentation completes
  • This is generally a short-sighted view, as white space will be re-used

Please note that we are discussing offline defragmentation via ESEUTIL /D, and not the online maintenance routines that run 24x7 in newer versions of Exchange, or during online maintenance windows in previous versions.

Background

What happens when an Exchange database is defragmented using ESEUTIL /D?  The defragmentation process copies the valid pages of ESE data out of the old database file into a new database file.  White space is left behind in the old file, as it does not contain data.  You will note that I specifically said new database.  The new database has a different GUID than the original.  A database with the same name but a different GUID is seen by Exchange as a different database, not as another copy of the same database.

 

This will result in errors like the following since the databases are not copies of one another.  Errors that may be seen include, but are not limited to:

  • An Active Manager operation failed. Error Operation failed with message: MapiExceptionJetErrorAttachedDatabaseMismatch: Unable to mount database. (hr=0x80004005, ec=-1216)
  • The Exchange store database <databasename> copy on this server appears to be inconsistent with the active database copy or is corrupted. For more details about the failure, consult the Event log on the server
  • Event ID 494:  Database recovery failed with error -1216 because it encountered references to a database, 'database path', which is no longer present
  • Event ID 454: Information Store (PID) <databasename>: Database recovery/restore failed with unexpected error -1216
  • Event ID 9519: The following error occurred while starting database <databasename>: 0xfffffb40. Failed to configure MDB.

 

Let’s look at an example of the impact caused by running offline defrag against a database that is replicated in a DAG.

Defragmenting Exchange 2010 DAG Database

We shall defragment database, DB01.  Our starting configuration has two copies of this database and all is currently running well.

Exchange 2010 DAG Database Starting Point

 

So let’s dismount DB01, and then validate that the two mailbox servers have the same GUID for DB01.  We are using ESEUTIL /MH to dump out the header from the database.

On the first mailbox server we see a Rand of 2733649.  The GUID is displayed in the 'DB Signature' line, and is the 'Rand' value.  Be sure to look at the correct signature, as the header contains a signature for the logs as well as one for the database.  It is expected that the Rand in these two lines will be different.

Exchange 2010 Database GUID = 2733649

 

On the second mailbox server we see the same Rand of 2733649.  You can see the server name in the title bar of the PowerShell window.

Exchange 2010 Database GUID Same On Second Database Copy = 2733649

We have shown that the same database is present on both servers, i.e. both copies have the same Rand of 2733649.
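The comparison above was done by eye, but it can also be scripted.  Below is a minimal sketch that pulls the Rand out of the 'DB Signature' line of ESEUTIL /MH output.  The function name Get-DbSignatureRand is hypothetical, and the sample header lines are illustrative rather than captured output, so check the exact layout against your own header dump.

```powershell
# Hypothetical helper: return the Rand value from the 'DB Signature' line
# of ESEUTIL /MH output. The 'Log Signature' line is deliberately ignored.
function Get-DbSignatureRand {
    param([Parameter(Mandatory=$true)][string[]]$HeaderLines)

    foreach ($line in $HeaderLines) {
        if ($line -match '^\s*DB Signature:.*\bRand:(\d+)') {
            return $Matches[1]
        }
    }
    throw "No 'DB Signature' line found in the header output."
}

# Illustrative sample only; the dates and the log Rand are made up.
$sampleHeader = @(
    '     DB Signature: Create time:01/01/2013 10:00:00 Rand:2733649 Computer:',
    '    Log Signature: Create time:01/01/2013 10:00:01 Rand:2741234 Computer:'
)

Get-DbSignatureRand -HeaderLines $sampleHeader
```

Against a real, dismounted database you would pass in the output of ESEUTIL /MH for the .edb file on each server, and then compare the two values returned.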

Let’s now defragment DB01 on the first server, then see what happens……

Exchange 2010 Offline Defragmenting DAG Database

Then let’s check the Rand to see if the old value of 2733649 is still present:

 

Exchange 2010 Database GUID = 143007541

Nope, it’s not.  The Rand is now 143007541.  Same name, but this is a different database.

Trying to activate the database copy on another server will create a sea of red in the application event log.  You will see errors like those listed above, and the most descriptive is Event ID 4807:

Active Manager Operation Failed Due To Offline Defrag

 

Recovering From Defragmenting DAG Database

At this point since the databases are no longer copies of one another we will have to re-seed the copy of the database.  Depending upon database size, disk throughput and network capacity this can take an extended period of time.  Let’s use PowerShell to re-seed the database copy:

Update-MailboxDatabaseCopy -DeleteExistingFiles -Identity DB01\Consea-MB2

 

Exchange 2010 Re-Seeding Database Copy Using PowerShell

This will have to be repeated for every passive copy of the database in question.  If there are multiple copies over a WAN link then it would be a good idea to manually specify the seeding source using the -SourceServer parameter.  That way one copy can be seeded over the WAN, and the other copies can then use that copy as a local source, thereby minimising WAN traffic and decreasing the time taken.
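As a rough sketch of that WAN approach: assume the defragmented copy is hosted on Consea-MB1, with two copies in a remote site on Consea-MB3 and Consea-MB4.  All three of those server names are made up for illustration, so substitute your own.

```powershell
# Hypothetical server names: Consea-MB1 hosts the defragmented database,
# Consea-MB3 and Consea-MB4 are in the remote site.

# Suspend the remote copies first if they are not already suspended or failed.
Suspend-MailboxDatabaseCopy -Identity DB01\Consea-MB3 -Confirm:$false
Suspend-MailboxDatabaseCopy -Identity DB01\Consea-MB4 -Confirm:$false

# 1. Seed one remote copy across the WAN.
Update-MailboxDatabaseCopy -Identity DB01\Consea-MB3 -SourceServer Consea-MB1 -DeleteExistingFiles

# 2. Once that copy is healthy, seed the second remote copy from it,
#    so the database only crosses the WAN once.
Update-MailboxDatabaseCopy -Identity DB01\Consea-MB4 -SourceServer Consea-MB3 -DeleteExistingFiles
```

The second seed should only be started once the first copy reports as healthy, since a copy that is still seeding cannot act as a source.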

Note that there are multiple options worth checking out with Update-MailboxDatabaseCopy.  They include options to explicitly choose a network, encryption and compression.  Chances are if you used Exchange 2010 RTM then you are quite adroit at using the -CatalogOnly switch!

 

When the seeding task completes, we can check that the database copies are OK:

Checking Database Copy Status In Exchange 2010 PowerShell
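For reference, a check along these lines can be run from the Exchange Management Shell.  This is a sketch rather than the exact command from the screenshot, and the choice of columns is my own:

```powershell
# List every copy of DB01 with its replication health at a glance.
Get-MailboxDatabaseCopyStatus -Identity DB01 |
    Format-Table Name, Status, CopyQueueLength, ReplayQueueLength, ContentIndexState -AutoSize
```

What we want to see is one copy as Mounted, the other copies as Healthy, and the copy and replay queue lengths at or near zero.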

Checking the Rand on the updated copy of the database, we can see that it has been updated and now has the same Rand which was generated by the defrag, 143007541. 

After Re-Seed Database Copy Has Updated Database GUID

 

Having to take a database offline for hours to defragment, and then manually reseeding all of its database copies is pretty painful.  Is there a better way to do this?

There certainly is!

A New Hope

Since Exchange 2010 introduced the online mailbox move feature, it is now pretty seamless to move mailboxes to a new mailbox database and, when the old database is empty, simply delete it!  This process can be made even better with the use of the SuspendWhenReadyToComplete parameter.  As an example:

New-MoveRequest -Identity 'User-21' -TargetDatabase DB01 -SuspendWhenReadyToComplete

This copies the vast majority of the mailbox content and then pauses.  The administrator then manually resumes the move request using Resume-MoveRequest.  This means we can copy mailbox content through the day with no user impact, and after hours the suspended moves can be rapidly completed.  This has to be one of my favourite Exchange 2010/2013 features!
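To evacuate a whole database rather than a single mailbox, the same idea can be scaled up.  The sketch below assumes the mailboxes currently live in DB01 and that DB02 is a newly created target database.  DB02 is a made-up name, and you will want to batch the moves to suit your environment.

```powershell
# DB02 is a hypothetical, newly created target database.

# During the day: queue a move for every mailbox in DB01, pausing each
# one just before the final switchover.
Get-Mailbox -Database DB01 -ResultSize Unlimited |
    New-MoveRequest -TargetDatabase DB02 -SuspendWhenReadyToComplete

# Keep an eye on progress.
Get-MoveRequest -TargetDatabase DB02 | Get-MoveRequestStatistics |
    Format-Table DisplayName, Status, PercentComplete -AutoSize

# After hours: complete the moves that are waiting at the suspend point.
Get-MoveRequest -MoveStatus AutoSuspended | Resume-MoveRequest

# Once everything has completed, clear out the finished move requests.
Get-MoveRequest -MoveStatus Completed | Remove-MoveRequest -Confirm:$false
```

Remember that a database may also hold arbitration mailboxes, which Get-Mailbox only returns when the -Arbitration switch is used.  They will need to be moved too before the old database can be removed.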

The same logic can also be applied to a mailbox database that must be evacuated for other reasons.  This may be necessary if file system AV has scanned the database, as the database will then be in an unknown and thus unsupported state.

 

Note that the Mailbox Replication Service (MRS) is throttled, and if you wish to apply a little accelerando to the move process then you will need to take a look at the throttling configuration.

 

Cheers,

Rhoderick

Comments
  • Rhod, great article as always. In a follow-up article I would love to see comments as to how long a db defrag can take and why trying to reclaim white space is often a fool's errand. Cheers and take care...sc

  • Howdy Sean!

    Very true, I can't recall when I last did an offline defrag to reclaim space.  Last time would have been many, many moons ago!  

    Say hello to Mr  Thiessen for me please :)

    Cheers,

    Rhoderick

  • Got a small query here regarding seeding...

    Consider 2(P) + 1(DR) copies of avg DB size 500GB.

    Copy/Replay status seems to be normal i.e. 0. Suddenly something went wrong like log file got missed, ending up in  disk space issue(P).

    Action plan - you now you cannot either activate nor truncate the logs.

    Dismounted the DB(P Active) moved the logs because it was healthy & mounted.

    Q - having the passive copy node *.edb file(500GB) can we use this as incremental updating the DB instead of complete DB seeding between nodes in P & DR site.

    I tried with –SeedingPostponed / -Force switch but no luck - does it works really...or there is no other option but to complete reseed...?

  • Hi Charles,

    Missed this over the long holiday weekend.    

    Did you do what I have above and dump out the databases headers with ESEUTIL /MH ?  

    What I expect you to find is that on the active instance, the database is updated to use the new log stream but on the passive that copy of the database file is still looking for the original log stream GUID.  Since it does not match the passive DB will not attach to the log stream.

    Does that match your observed behaviour ?

    Cheers,

    Rhoderick

  • So you are right about the new log stream as it will not attach but is there any way we could use the same passive old DB for incremental seeding instead asking system to deleteexistingfiles and seed.

  • Don't think so but, let me have a think about that please.  Things are a bit hectic at the mo, and I'll reply when I get a chance.

    Cheers,

    Rhoderick

  • Hmn I understand - Not able to figure out any workaround but will hope anything to come up in future may be seeding via WAN to DR wouldn't be feasible for DB size from 500GB-2TB
