Replication: Transactional Replication Performance Issue

  • one possible cause...

  • There are many factors that can influence the performance on a transactional replication topology. I would like to address one that is not so commonly known.

  • Problem:

  • I had worked on an issue in which we were experiencing performance latency. Once the latency was pinpointed, and in this case it was between the DISTRIBUTOR and SUBSCRIBER, we dived into the matter further and found the following:

  • Environment:

  • This applies to both SQL 2000 and SQL 2005.

  • Push Transaction Replication

  • Analysis:

  • The main issue we saw was that the article was added with @status = 8.

  • In BOL, this is explained by the following:

  • [ @status= ] status

  • Specifies if the article is active and additional options for how changes are propagated. status is tinyint, and can be the | (Bitwise OR) product of one or more of these values.

  • Value

    Description

    1

    Article is active.

    8

    Includes the column name in INSERT statements.

    16 (default)

    Uses parameterized statements.

    24

    Includes the column name in INSERT statements and uses parameterized statements.

    64

    Identified for informational purposes only. Not supported. Future compatibility is not guaranteed.

  • For example, an active article using parameterized statements would have a value of 17 in this column. A value of 0 means that the article is inactive and no additional properties are defined.

  • https://technet.microsoft.com/en-us/library/ms173857.aspx

  • Once we change this value to 24, then the performance for PUSH increased.

  • Explanation:

  • SQL to SQL replication should always be encouraged to use ‘parameterization’ (which is the default) with stored procedures. This has not changed from SQL2000 to SQL2005.

  • By default it is parameterized, meaning column values read from log records are kept in raw binary format without conversion to Unicode.

  • Insert / Update / Delete commands in MSrepl_commands table are constructed in a way that can not be viewed easily (that’s why we have sp_replshowcmds and sp_browsereplcmds).

  • Distribution agent knows the format and it extracts each parameter with type and size info. This is most efficient and should be used by default unless there is specific reason (e.g. certain type binding does not work on heterogeneous subscribers).

  • ‘String literal’ format means we convert every column value into Unicode string and the commands in MSrepl_commands table are stored as flat Unicode string. This means you can view it easily compared to the above. You can see how this has a lot more overhead compare to parameterized format.

  • Parameterized or ‘string literal’ is an option in sp_addarticle specified by @status.

  • If you do a sp_helparticle or look at sysarticles you should be able to see the value of column status (or eventually you generate the creation script of the article to get the sp_addarticle statement)

  • Tests:

  • When using status=8, we have: 5000 single updates running like this:

  • <example>

  • exec sp_executesql N'update [dbo].[TableName] set [Year] = 2014 where [ColumnName] = 1 and [ColumnName] = @P1',N'@P1 nvarchar(2)',N'01'

  • Duration = 63.387.350, sum for all the 5000 statements

  • When using status=24, we have 5000 single updates running. These are looking like this:

  • exec [sp_MSupd_dboTableName] default,default,2011,default,default,default,default,default,default,default,default,default,default,default,default,

  • default,default,default,default,default,default,default,1,'01',0x040000

  • Duration is 2.256.404, sum for all the 5000 statements

  • <example 2>

  • SQL Server 2005 – Push Replication

  • 2000 commands 2000 transactions

  • @status = 16 and @status = 24

  • Num Trans/sec:            33.68

  • Num Cmd/sec:              33.68

  • SQL Server 2005 – Push Replication

  • 2000 commands 2000 transactions

  • @status = 8

  • Num Trans/sec:            3.24

  • Num Cmd/sec:              3.24