Data and Backup Compression in SQL 2008 (Part 2)

 

After my follow-up post about our SQL Server 2008 event in TVP, where I've presented about SQL Server 2008 Data and Backup compression (well, actually, I did made another last minute presentation on behalf of a friend on SQL 2008 Upgrade :-)), I've been asked to post  my slide deck on SQL Server 2008 Data and Backup Compression.

Unfortunately, this is not feasible, therefore I thought to post some of the information I've given during the session.

 

Compression in SQL Server comes in different flavours, and as I've realized during my presentation very few people (only 1 out of I believe 62?) know that we have already some kind of data compression which was released together with SQL Server 2005 Service Pack 2, called Vardecimal Storage Format (https://technet.microsoft.com/en-us/library/bb508963.aspx).

 

 This type of compression allows you to store decimal and numeric data in the variable portion of the row instead that in the fixed portion.

 

The compressio ratio you are going to achieve depends on your specific data, but is easy to evaluate in advance by using a simple store procedure:

 

sys.sp_estimated_rowsize_reduction_for_vardecimal ‘<table>‘

 

Before we move on to the options available in 2008, let me summarize few points that apply both to 2005 SP2 compression and 2008 Compression:

 

  • As said for 2005, the compression ratio you are going to achieve depends on your specific data, but both 2005 and 2008 come with a stored procedure which will allow yourself to estimate which ratio you are going to achieve
  • Compression allows to reduce the disk space required for storing your data but by using compression you will see as well:
    • an improvement in memory utilization
    • better performances for I/O bound workloads
  • On the other hand, you will have to pay all these benefits through CPU utilization, therefore in most of the cases you will see a small amount of CPU degradation
  • All compression types are only available in Enterprise edition
  • Application do not need to be changed (data is presented to applications as before)

 

OK So what about SQL Server 2008 then?

 

2008 comes with two new types of compression:

 

  1. ROW Compression (https://msdn.microsoft.com/en-us/library/cc280449(SQL.100).aspx) is very similar to Vardecimal Storage Format in SQL 2005 SP2 but the data-types that you can compress are more (basically all the types that are based on numeric so also datetime and money).

 

Vardecimal storage format is still supported in 2008 however, because row-level compression achieves the same goals, the vardecimal storage format is deprecated.

 

  1. PAGE Compression (https://technet.microsoft.com/en-us/library/cc280464(SQL.100).aspx) which includes Row Compression plus two additional compression algorithms, called
    • column prefix and
    • page dictionary.

 

Without going into the details of the two algorithms, (which if you are interested you can find on the SQL Server Storage Engine Blog - Data Compression Series) what you need to know is that while ROW compression looks at a single piece of data at a time, and compress it, the two other algorithms included with PAGE compression look at 'recurrences' in the column\page, at a byte level, therefore you are going to achieve great compression ratio every time your bytes are repeated in your page (bytes, not data).

 

As usual, you have a stored procedure to estimate the compression ratio:

 

sp_estimate_data_compression_savings
      [ @schema_name = ] 'schema_name' 
     , [ @object_name = ] 'object_name'
    , [@index_id = ] index_id
     , [@partition_number = ] partition_number
    , [@data_compression = ] 'data_compression'
[;]

 

Another thing worth knowing is that you have control over the objects you want to data-compress: therefore you can decide if you want to compress one single index (and leave the others un-compressed) or a single partition.

 

What about SQL Server 2008 Backup Compression? I bet this will be one of the major driver for the adoption of SQL 2008 as I know that most of the customer were looking forward for such a feature already since 2005.

 

Well, obviously Backup Compression will save the storage space required for your data (whoooooop!) and we are talking of compression ratios of 5:1, therefore completely comparable to most 3rd party software you know on earth :-).  Official data is not available yet, but I will post some info as soon as I will get hold of it!

 

Interesting fact is that actually, the run time of your backup\restore operation is going to decrease. If you want some numbers, have a look to this article from Paul Randal (SQL Server 2008: Backup Compression CPU Cost ) which exposes some interesting results from some tests done on our lovely AdventureWorks database!

 

Well, this was a very long post to write so I hope to have included most of the relevant information re. Compression in SQL Server 2008.

If you have any question, just drop me a comment and I will reply!

 

- Beatrice Nicolini -