Update 3/31/2011: The updated Exchange 2007 SP3 RU3 has been released. See Announcing the Re-release of Exchange 2007 Service Pack 3 Update Rollup 3 (V2).
3/30/2011: We posted a status update for this issue. See Exchange 2007/2010 Rollup 3 Status Update.

Over the weekend, the Exchange Product Group was made aware of an issue which may lead to database corruption if you are running Exchange 2007 Service Pack 3 with Update Rollup 3 (Exchange 2007 SP3 RU3). Specifically, the issue was introduced in Exchange 2007 SP3 RU3 by a change in how the database is grown during transaction log replay when new data is written to the database file and there are no available free pages to be consumed.

This issue is of specific concern in two scenarios: 1) when transaction log replay is performed by the Replication Service as part of ensuring the passive database copy is up-to-date and/or 2) when a database is not cleanly shut down and recovery occurs.

While only a small number of customers have been affected to date, we believe the risk is significant enough that we are recommending that all customers uninstall Exchange 2007 SP3 RU3 on all Mailbox servers and Transport servers. Uninstalling the rollup will revert the system to the previously installed version. We have also removed the Exchange 2007 SP3 RU3 download from the Microsoft Download Center and from Microsoft Update until we are able to produce a new version of the rollup.

We are actively working on this issue and, based on test results, plan to release an updated version of Exchange 2007 SP3 RU3 to the Download Center later this week. In addition, we are conducting an internal review of our processes to determine how to prevent issues such as this in the future.

When this issue occurs, events similar to the following are logged in the Application event log of the Mailbox server. Regardless of whether you see these types of events, you should review the recovery instructions and begin that process. If you are uncomfortable performing any of these steps, please contact Microsoft Support for assistance.

  • Event ID: 454
    Event Type: Error
    Event Source: ESE
    Event Category: Logging/Recovery
    Description: Microsoft.Exchange.Cluster.ReplayService (12716) Recovery E20 SG1\DB1: Database recovery/restore failed with unexpected error -4001.
  • Event ID: 2095
    Event Type: Error
    Event Source: MSExchangeRepl
    Event Category: Service
    Description: Log file D:\logs\SG1\E200006AFAE.log in SG1\DB1 could not be replayed. Re-seeding the passive node is now required. Use the Update-StorageGroupCopy cmdlet in the Exchange Management Shell to perform a re-seed operation
  • Event ID: 2097
    Event Type: Error
    Event Source: MSExchangeRepl
    Event Category: Service
    Description: The Microsoft Exchange Replication Service encountered an unexpected Extensible Storage Engine (ESE) exception in storage group 'SG1\DB1'. The ESE exception is a read was issued to a location beyond EOF (writes will expand the file) (-4001) ().
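
As a rough aid for triage, the event signatures above can be checked against an exported copy of the Application event log. The following is a minimal Python sketch, not a supported tool. It assumes the log has already been exported into a list of records with `event_id`, `source`, and `description` fields; that record layout and the function names are hypothetical and chosen only for illustration.

```python
# (event ID, event source) pairs taken from the three sample events above.
SIGNATURES = {
    (454, "ESE"),
    (2095, "MSExchangeRepl"),
    (2097, "MSExchangeRepl"),
}


def is_suspect(event):
    """Return True if the record's ID and source match a listed signature.

    `event` is a hypothetical dict with "event_id", "source", and
    "description" keys, for example one row of an exported event log.
    """
    return (int(event["event_id"]), event["source"]) in SIGNATURES


def mentions_eof_error(event):
    """Return True if the description cites ESE error -4001."""
    return "-4001" in event.get("description", "")


def find_suspect_events(events):
    """Return the records that match a signature, in their original order."""
    return [e for e in events if is_suspect(e)]
```

A match only indicates that the server should be reviewed. As noted above, the recovery steps apply whether or not these events are present.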

In addition, in environments utilizing Continuous Replication, comparison of the database file between the active and passive nodes will indicate that the database file has decreased in size.
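
The size comparison can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two paths are hypothetical placeholders for the active and passive copies of the same database file, and the sketch assumes that the passive copy is the one that appears smaller. A passive copy can also differ in size for benign reasons, so treat a difference as a prompt for further review rather than proof of corruption.

```python
import os


def size_difference(active_bytes, passive_bytes):
    """Return how many bytes larger the active copy is than the passive copy."""
    return active_bytes - passive_bytes


def passive_is_smaller(active_path, passive_path):
    """Compare the on-disk sizes of two database files.

    Both paths are hypothetical; in practice they would point at the same
    database file on the active and passive nodes.
    """
    active = os.path.getsize(active_path)
    passive = os.path.getsize(passive_path)
    return size_difference(active, passive) > 0
```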

Regardless of whether you are experiencing this issue, we strongly recommend taking the actions below to ensure that you do not experience any data loss or outage associated with this issue.

For example:

  • If you have deployed your Mailbox servers utilizing Cluster Continuous Replication (CCR), failure of the active copies may affect your service SLA, as you may have no viable passive copies to activate. Hardware failures may leave you without a means to recover up to the point of failure, and you may therefore experience data loss.
  • If you have deployed your Mailbox servers utilizing Single Copy Clusters (SCC), switchovers or failovers may result in this issue as there is only one copy of the database and recovery is performed during switchovers and failovers.

For environments leveraging CCR and/or Standby Continuous Replication (SCR)

If you note the listed events in your environment, the following steps must be taken to restore your high-availability configuration:

  1. Roll back the CCR Mailbox server hosting the passive database copies and any SCR target Mailbox servers to the previously installed version (e.g., Exchange 2007 SP3 RU2) by uninstalling RU3.
  2. Re-seed all affected database copies on the CCR Mailbox server and any SCR target Mailbox servers hosting the passive database copies.
  3. Verify the database copy status is healthy for all passive copies.
  4. Perform a switchover and roll back the remaining CCR Mailbox server to the previously installed version (e.g., Exchange 2007 SP3 RU2).

If you are not seeing these events in your continuous replication enabled environment, we recommend the following steps:

  1. Roll back the CCR Mailbox server hosting the passive database copies and any SCR target Mailbox servers to the previously installed version (e.g., Exchange 2007 SP3 RU2) by uninstalling RU3.
  2. Perform a switchover and roll back the remaining CCR Mailbox server to the previously installed version (e.g., Exchange 2007 SP3 RU2).

For environments leveraging Single Copy Clusters (SCC)

  1. Roll back passive nodes within the SCC environment to the previously installed version (e.g., Exchange 2007 SP3 RU2) by uninstalling RU3.
  2. Perform a switchover and roll back the remaining SCC Mailbox server nodes to the previously installed version (e.g., Exchange 2007 SP3 RU2).
  3. If you have any databases that will not mount as a result of the above issue, you can restore the damaged databases leveraging a last known good backup.

For environments leveraging standalone Mailbox (or Public Folder) servers

  1. Roll back the standalone Mailbox servers to the previously installed version (e.g., Exchange 2007 SP3 RU2) by uninstalling RU3.
  2. If you have any databases that will not mount as a result of the above issue, you can restore the damaged databases leveraging a last known good backup.

For Hub Transport and Edge Transport servers

  1. Roll back the standalone transport servers to the previously installed version (e.g., Exchange 2007 SP3 RU2) by uninstalling RU3.
  2. If any transport servers have mail.que databases which currently do not mount as a result of the above issue, you can recover them by following the steps in Working with the Queue Database on Transport Servers.
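
For teams that track remediation per server, the procedures above can be condensed into a small lookup. The Python sketch below simply restates the steps from this post as data. The deployment labels (`ccr_scr`, `scc`, `standalone`, `transport`) and the function name are hypothetical, and the sketch does not perform any action on a server.

```python
def recovery_steps(deployment, events_seen=False):
    """Return the ordered steps from this post for one deployment type.

    `deployment` is one of the hypothetical labels "ccr_scr", "scc",
    "standalone", or "transport". `events_seen` only matters for "ccr_scr".
    """
    if deployment == "ccr_scr":
        steps = ["Roll back the passive CCR node and any SCR targets by uninstalling RU3."]
        if events_seen:
            # Re-seeding and verification apply only when the events were logged.
            steps.append("Re-seed all affected database copies on those servers.")
            steps.append("Verify the copy status is healthy for all passive copies.")
        steps.append("Perform a switchover and roll back the remaining CCR node.")
        return steps
    if deployment == "scc":
        return [
            "Roll back the passive SCC nodes by uninstalling RU3.",
            "Perform a switchover and roll back the remaining SCC nodes.",
            "Restore any database that will not mount from a last known good backup.",
        ]
    if deployment == "standalone":
        return [
            "Roll back the standalone Mailbox server by uninstalling RU3.",
            "Restore any database that will not mount from a last known good backup.",
        ]
    if deployment == "transport":
        return [
            "Roll back the transport server by uninstalling RU3.",
            "Recover any mail.que database that will not mount.",
        ]
    raise ValueError("unknown deployment type: %r" % (deployment,))
```

For example, `recovery_steps("ccr_scr", events_seen=True)` returns the four-step list for a CCR environment in which the events were logged.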

Kevin Allison
GM Exchange Customer Experience