Tim McMichael

Navigating the world of high availability...and occasionally sticking my head in the cloud...

Exchange 2010: Log Truncation and Checkpoint At Log Creation in a Database Availability Group

Exchange 2010: Log Truncation and Checkpoint At Log Creation in a Database Availability Group

  • Comments 21
  • Likes

In previous versions of Exchange, when a backup was completed, almost all log files prior to the current log file were truncated from the system.  Administrators monitoring the directory would originally see many logs, and post backup note that only a few logs remained.  In Exchange 2010 Service Pack 1 and later administrators note that multiple log files remain on the disk post backup or the appearance that no log files have truncated at all.  In many cases this leads to a belief that logs are actually not truncating successfully or that there is an issue with backups.

 

Why do we see logs remaining on disk for longer?  Exchange 2010 SP1 and newer introduces a change in the behavior of log truncation.  The changes were taken to ensure that replicated copies of databases within a database availability group always had the appropriate log files on the source server to complete an incremental resynchronization. 

 

The change to log truncation is the tracking of Checkpoint At Log Creation.  Remember that in a database availability group we can expect the checkpoint to be approximately 100 logs (or slightly more) off the current log file – this is known as checkpoint depth.  As Exchange creates new log files we stamp into the header of the new log files what log file the checkpoint was pointing at when the current log was created.  For example, let us say that log file 0xA679 (42617) was just created as the current ENN.log.  We can expect that the checkpoint at log creation value stamped within the header of this log file would be approximately 0xA16 (42517).  You can see the checkpoint at log creation value by using eseutil /ml <logfilename> to dump the header of a log file.

 

[PS] P:\DAG\DAG-DB0\DAG-DB0-Logs>eseutil /ml .\E020000A67E.log

Extensible Storage Engine Utilities for Microsoft(R) Exchange Server
Version 14.02
Copyright (C) Microsoft Corporation. All Rights Reserved.

Initiating FILE DUMP mode...

      Base name: E02
      Log file: .\E020000A67E.log
      lGeneration: 42622 (0xA67E)
      Checkpoint: (0xA679,8,0)
      creation time: 03/11/2012 06:00:48
      prev gen time: 03/11/2012 04:01:17
      Format LGVersion: (7.3704.16.2)
      Engine LGVersion: (7.3704.16.2)
      Signature: Create time:05/02/2010 18:04:08 Rand:399094376 Computer:
      Env SystemPath: d:\DAG\DAG-DB0\DAG-DB0-Logs\
      Env LogFilePath: d:\DAG\DAG-DB0\DAG-DB0-Logs\
      Env Log Sec size: 512 (matches)
      Env (CircLog,Session,Opentbl,VerPage,Cursors,LogBufs,LogFile,Buffers)
          (    off,   1027,  51350,  16384,  51350,   2048,   2048,  29487)
      Using Reserved Log File: false
      Circular Logging Flag (current file): off
      Circular Logging Flag (past files): off
      Checkpoint at log creation time: (0xA679,8,0)
      1 d:\DAG\DAG-DB0\DAG-DB0-Database\DAG-DB0.edb
                 dbtime: 18078306 (0-18078306)
                 objidLast: 2957
                 Signature: Create time:05/02/2010 18:04:08 Rand:399127765 Computer:
                 MaxDbSize: 0 pages
                 Last Attach: (0xA348,9,86)
                 Last Consistent: (0xA346,9,B5)

      Last Lgpos: (0xa67e,252,0)

Number of database page references:  770

Integrity check passed for log file: .\E020000A67E.log


Operation completed successfully in 0.265 seconds.

 

In the previous example the checkpoint at log creation is 0xA679.

 

Within a DAG all servers that contain a replicated copy of a database report the maximum log file that is eligible for truncation.  These values are reported to the active node which subsequently calculates the maximum log file for truncation.  In simplest terms the following process occurs:

 

  • Passive copy on Node-2 reports OK to truncate log 0xA679 (42617).
  • Passive copy on Node-3 reports OK to truncate log file 0xA678 (42616)
  • Passive copy on Node-4 reports ok to truncate log 0xA679 (42617).
  • The active node determines that the best log file eligible for truncation based on the passive copies is 0xA678 (42616).  [This is essentially the minimum of all reported OK logs to truncate.]
  • The active node then looks at the checkpoint at log creation of 0xA678 (42616) and determines that value is 0xA614 (42516).  In this example that would be 100 logs off the best log reported for truncation of the passive copies.
  • The active node sets the truncation point to be log 0xA614 (42516).
  • Therefore after a successful backup logs prior to 0x614 (42516). would truncate.

 

This essentially means that 100 additional logs that would have previously truncated prior to this change do not truncate.

 

Taking into account checkpoint at log creation administrators can better understand how log files are truncated and why log files remain on disk after a backup that might have in prior versions been truncated.

 

============================

Update 5/16/2012

Corrected hex conversions in example.

============================

Comments
  • Dec 42516 is Hex A614 and not A15 - at least using windows 7 calculator :)

  • @Andreas:

    Thanks - problem between keyboard in chair.

    I fixed it up and updated it online.

    TIMMCMIC

  • We can expect that the checkpoint at log creation value stamped within the header of this log file would be approximately 0xA16 (42517).

    42517 - 0xA615 in Hex.

  • With Exchange Exchange 2010 SP3 we have seen the amount of Logfiles that stay after successfull back on A DAG around 200+. Prior to SP3 this may have been 100+. This is also confusing and seeme to be undocumented on MS side which clearly makes a lot of trouble on the market. There are hundres of threads and blogs who describe this and tend to go in direction of search indexer/Catalog files blocking.

    What Software you finally use Windows Backup, Netapp Snap Manager for Exchange or any Exchange Aware Backup seems to have no influence on the amount of logfiles the DAG keeps even after backup. Make sure you have enough Test data in your Test mailbox (Attachments).

  • @Michael:

    To my knowledge there is no difference between the versions documented here and SP3.  Also - as far as I know - there is no correlation between the number of items that the index is backlogged and how we truncate log files.  

    There should be no expectation that the backup solution has any impact on how logs are truncated, as the information store / replication service is what's responsible for determining log file truncation.

    TIMMCMIC

  • Hello Michael,

    First of all, thank you for such detailed and interesting article!

    I have a question. Is there any way to manipulate with Checkpoint At Log Creation time? Is it possible to influence on its value?

    Some of our customers have problems after backup, because there copies sporadically become FailedAndSuspended with "Log file 1 missing" error. Since full backup is intended to truncate old log files, there should be such ancient "log file 1", because it was truncated long time ago. The behavior repeats sporadically, not always, on different databases. MS Partner Support does not what to do with it, and the backup software vendors also give up finding a solution for the problem. Fortunately, it is possible to resume the FailedAndSuspended copy manually in EMC. But we need to find a solution to get rid of the problem and routine to manually resume the copy every time it fails after a backup.  

  • Hello, Tim .

    Do you know if CACL and GTP still on Exchange 2013 ?

  • Yes.

    TIMMCMIC

  • srs

  • If the backup interrupted ,After a successful snapshot, It truncating the logs, If we set backupSucceeded as false. What may be the problem ?.

  • @Anonymous... I'm not sure what you are asking. If a backup is interrupted for any reason, log file truncation is not allowed to occur. TIMMCMIC

  • In a pre-prod environment we have many dbs that don't yet have active user mailboxes, just health mailboxes. Logs are generated every 15 mins exactly. In these cases the difference between the passive copy's "ok to truncate" and the active copy's is often times hundreds of logs. So after truncation you could have 300-500 or more logs remaining that are days old. Is that due to the infrequency that the logs are generated?

  • @MB...
    Yes - this sounds about right. When I first encountered this issue was in a customers test environment where they were generating a handful of log files but saw no significant truncation.

    TIMMCMIC

  • d

  • thanks...

Your comment has been posted.   Close
Thank you, your comment requires moderation so it may take a while to appear.   Close
Leave a Comment