One of the great new features of SQL 2005 is database snapshots. With one simple transact SQL statement, you can provide your users with a read-only, static copy of their data. For example, you can take a snapshot on the last day of the month and users can do month-end reporting without worrying about changes to the underlying data. That’s pretty powerful. But did you know that snapshots can also be used to recover to a point in time? That is the subject of this article – how to use database snapshots to significantly reduce your recovery time in certain emergency situations.
As DBA’s, we spend a lot of time worrying about how to protect our data from disaster. There are really 2 goals here:
1. Amount of acceptable data loss.
2. Amount of acceptable downtime.
On the higher end of acceptable data loss we have full and differential backups. On the low end we have log backups, log shipping and finally mirroring where the potential loss is mere seconds. Pretty cut and dried.
But what about the down TIME involved? With mirroring you can fail over to a hot standby rapidly and automatically, thus achieving the smallest possible recovery time. This protects your data against physical disaster but does nothing to protect you from data corruption, because the mirror simply replicates the data “as is”.
To roll back bad data, you’re back to good old full, differential and log backups. For small databases this can be pretty simple. But it is often meticulous work, and if you restore one too many logs, you’ll have to start all over. For large databases, the recovery time can be very long. If you’re in this business long enough you’ll experience losing the online backups and have to restore from tape. I’ve seen situations where this took 1-2 weeks!
This is where snapshots can really help. Since the database can be recovered DIRECTLY from a snapshot, without going through the full, differential and log backups, this gives us a way to be very granular with point-in-time recoveries. And the recovery is almost instantaneous. So changes can be rolled back with almost no downtime.
Here is how we use it in MSCOM. Let’s say I’m doing an application upgrade. Before starting, I create a snapshot. Now something goes wrong. The deployment doesn't pass "smoke" testing, for what ever reason you can imagine -- and we decide to roll back and start again. I just recover from the snapshot, and in seconds all the changes are gone. It’s like having an undo file, or running the whole deployment in a transaction. And it works for both DDL and DML changes.
I think you can see how powerful this can be. We recommend making this a regular part of your internal change controls. Having said that, there are a few points to consider:
Now for some sample code to get you started:
set nocount on
declare @sql_string varchar (4000)
set @sql_string = 'Create Database ' + db_name() + '_Recovery_SS ON '
select @sql_string = @sql_string +
'(Name = ''' + name + '''' + ', FILENAME = ''' +
substring(filename, 1, len(filename)-3) + 'ss''),'
where groupid <> 0
select @sql_string = substring(@sql_string, 1, len(@sql_string)-1) + ' AS SNAPSHOT OF ' + db_name()
-- exec (@sql_string)
drop database <db-name_Recovery_SS>
RESTORE DATABASE <db-name> from
DATABASE_SNAPSHOT = <db-name_Recovery_SS>'
Here’s a test I did. Try it for yourself. First create a snapshot. Then in your original database, create a table, add 1 million rows, run sp_spaceused to prove they exist. Then restore from the snapshot and run sp_spaceused again to prove both schema changes and data are rolled back. On my system, the restore took 5 seconds:
create table Lots_Inserts (
Recno int identity,
declare @counter int
set @counter = 0
while @counter < 1000000
insert Lots_Inserts (Time, Data)
values ( getdate(), DATEPART(ms, GETDATE()) )
select @counter = @counter +1
-- Took 3 minutes, 30 seconds