Tim McMichael

Navigating the world of high availability...and occasionally sticking my head in the cloud...

Backups fail due to consistency check failure…


Last week I had the opportunity to work with a customer who was experiencing issues backing up their Exchange 2010 databases.  The issue they experienced is also relevant to Exchange 2007 and Exchange 2003 installations that use VSS-based backups with consistency checking enabled.

 

After reviewing the logs it was apparent that the VSS process was functioning appropriately.  All relevant events regarding the snapshot process were present.  In this case the backup job was configured for consistency check, and relevant consistency check events were noted.  In almost all backup jobs the following error was present in the logs:

 

Log Name: Application
Source: Storage Group Consistency Check
Event ID: 403
Task Category: Termination
Level: Error
Keywords: Classic
Description:
Instance: The physical consistency check successfully validated 0 out of xxxxxxxx pages of database 'DATABASE'. Because some database pages were either not validated or failed validation, the consistency check has been considered unsuccessful.

 

In general this event indicates that the consistency check encountered an error while scanning the pages of an Exchange database.  In most cases this means there is page-level corruption in the database, which causes the validation checks to fail and the backup to be terminated.  This is by design.

 

In theory corruption of this type should not be present in the environment as configured.  The customer was using a Database Availability Group, which has built-in protections to self-heal databases from this type of corruption.  Replication was healthy and there was no indication that any page corrections had been performed.

 

If you look at the event in greater detail you will see that it reports the number of pages that were successfully validated before the issue occurred.  Reviewing the application logs showed that, for the same database, the failure occurred after a different number of pages each time.  For example, one failure occurred after scanning 28000 pages and another after 42456 pages.  A damaged page would be expected to stop the check at the same point on every run, so this pointed away from database corruption.
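
The comparison of page counts across failures can be sketched in a few lines of Python. This is only an illustration, assuming the Event ID 403 descriptions have already been exported as plain text; the function names and the total page count in the sample are hypothetical, while the two validated counts are the ones mentioned above.

```python
import re

# Matches the page counts and database name in the Event ID 403 description.
EVENT_403 = re.compile(
    r"successfully validated (\d+) out of (\d+) pages of database '([^']+)'"
)


def validated_pages(descriptions):
    """Group the validated-page counts from each failure by database name."""
    counts = {}
    for text in descriptions:
        match = EVENT_403.search(text)
        if match:
            validated, _total, database = match.groups()
            counts.setdefault(database, []).append(int(validated))
    return counts


def failure_point_varies(counts):
    """Return the databases whose failures stopped at different page counts."""
    return sorted(db for db, pages in counts.items() if len(set(pages)) > 1)


# Sample descriptions; the total of 5000000 pages is a made-up placeholder.
sample = [
    "Instance: The physical consistency check successfully validated "
    "28000 out of 5000000 pages of database 'DATABASE'.",
    "Instance: The physical consistency check successfully validated "
    "42456 out of 5000000 pages of database 'DATABASE'.",
]

counts = validated_pages(sample)
varying = failure_point_varies(counts)
print(counts, varying)
```

A database that appears in the second result failed at a different point on different runs, which is the pattern seen in this case.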

 

At this point when reviewing the system log the following error was noted:

 

Time:     1/9/2012 12:40:56 PM
ID:       36
Level:    Error
Source: volsnap
Machine:  server.company.com
Message:  The shadow copies of volume F: were aborted because the shadow copy storage could not grow due to a user imposed limit.

 

This error implies that, while differential changes were being stored during the life of the snapshot, the allotted shadow storage space was exhausted and could not grow.  The output of vssadmin list shadowstorage showed that the shadow storage space assigned to the volume hosting the database was 321 megabytes.

 

vssadmin list shadowstorage

Shadow Copy Storage association
   For volume: (F:)\\?\Volume{0ecc7a68-be78-4c40-baf6-4d0d3b0b6693}\
   Shadow Copy Storage volume: (H:)\\?\Volume{ed074b1d-b500-465b-a720-d2f733f49761}\
   Used Shadow Copy Storage space: 0 B (0%)
   Allocated Shadow Copy Storage space: 0 B (0%)
   Maximum Shadow Copy Storage space: 321 MB (0%)
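
If you want to check several servers for the same condition, the output above can be scanned with a short script. The following is a rough Python sketch, assuming the output has been captured as text in the English format shown; the function name and the 1 GB threshold are arbitrary choices for illustration, not recommended values.

```python
import re

UNITS = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}

FOR_VOLUME = re.compile(r"For volume: \((\w:)\)")
MAXIMUM = re.compile(
    r"Maximum Shadow Copy Storage space: ([\d.]+) (B|KB|MB|GB|TB)\b"
)


def small_shadow_storage(output, minimum_bytes):
    """Return {volume: maximum_bytes} for volumes below minimum_bytes."""
    flagged = {}
    volume = None
    for line in output.splitlines():
        if "Shadow Copy Storage association" in line:
            volume = None  # a new association block starts here
            continue
        match = FOR_VOLUME.search(line)
        if match:
            volume = match.group(1)
            continue
        match = MAXIMUM.search(line)
        if match and volume is not None:
            size = int(float(match.group(1)) * UNITS[match.group(2)])
            if size < minimum_bytes:
                flagged[volume] = size
    return flagged


# The association shown in this post, captured as raw text.
sample = r"""Shadow Copy Storage association
   For volume: (F:)\\?\Volume{0ecc7a68-be78-4c40-baf6-4d0d3b0b6693}\
   Shadow Copy Storage volume: (H:)\\?\Volume{ed074b1d-b500-465b-a720-d2f733f49761}\
   Used Shadow Copy Storage space: 0 B (0%)
   Allocated Shadow Copy Storage space: 0 B (0%)
   Maximum Shadow Copy Storage space: 321 MB (0%)
"""

flagged = small_shadow_storage(sample, minimum_bytes=1024**3)
print(flagged)
```

Entries whose maximum is not a plain number and unit are simply skipped by this sketch.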

 

This is an extremely small shadow copy storage space.  By default the allotted space is generally 10% of volume size.  To correct this issue we can use the vssadmin command to resize the shadow storage space.

 

vssadmin Resize ShadowStorage /For=F: /On=F: /maxsize=20%

Successfully resized the shadow copy storage association

In our case the inability to continue storing differential changes in the shadow storage space caused the shadow copy to be removed.  This in turn caused the consistency check to fail, resulting in a failure of the backup job.  Once the shadow copy storage was allocated an appropriate size, and differential changes could be stored for the entire duration of the backup operation, the backups completed successfully.
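
As a back-of-the-envelope check on sizing, the reasoning above can be written as simple arithmetic: the shadow storage has to hold the changes made to the volume for as long as the snapshot exists. The Python sketch below is only an approximation under stated assumptions. The function name, the change rate of 10 MB per minute and the 120 minute backup window are hypothetical, and only the 321 MB limit comes from this case.

```python
def shadow_storage_sufficient(max_storage_mb, change_rate_mb_per_min, backup_minutes):
    """Rough check: do the changes made during the backup fit in shadow storage?

    Treats every changed megabyte as needing a megabyte of shadow storage,
    which is a simplification and not an exact model of how VSS uses space.
    """
    needed_mb = change_rate_mb_per_min * backup_minutes
    return needed_mb <= max_storage_mb, needed_mb


# 321 MB is the limit from this case; the rate and duration are made up.
ok, needed_mb = shadow_storage_sufficient(321, 10, 120)
print(ok, needed_mb)
```

With these made-up figures the limit is reached well before the backup finishes, which matches the behavior seen here.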

Comments
  • Very useful information , thanks Tim..

  • Great Work!

  • @Karthik:

    I'm glad you found it useful.

    TIMMCMIC

  • @Mani..

    Thanks

    TIMMCMIC

  • Tim, firstly thanks a lot for your explanation of this issue. I have reproduced this issue in my environment, but I see the volsnap 33 event in the system log, as below: The oldest shadow copy of volume D: was deleted to keep disk space usage for shadow copies of volume D: below the user defined limit. By the way, I am using Exchange 2013 + NetBackup 7.6 to reproduce this issue.

  • But I do not fully understand what you said: In our case the in-ability to continue to store differential changes in the shadow storage space caused the shadow copy to be removed. This subsequently caused consistency check to fail resulting in a failure of the backup job. I am not familiar with the Exchange writers + VSS process during the backup operation. Could you please explain more about the entire flow of the Exchange writers + VSS activities when a backup application such as NetBackup or TSM requests that a shadow copy be created? Anyway, thanks a lot for your help in advance!

  • @Anonymous... That would imply that there was an existing snapshot on the volume or maybe a previously orphaned snapshot that could be removed. In this case there exists only one snapshot - the snapshot that is in progress - and it cannot be expanded due to reaching the user defined limit. TIMMCMIC
