Li Zhang (SQL)'s Web Logs

This web log is about replication and other SQL Server technologies.

  • Exciting SDET opportunities in SQL Server data replication and synchronization team, Microsoft Corporation

    We have a few Software Design Engineer Test openings in the data replication and synchronization team of SQL Server, Microsoft Corporation. Prospective candidates are encouraged to contact li.zhang@microsoft.com. The following are job descriptions.

     Description 1:

    Do you get excited about distributed systems and the challenges this domain presents? Do you want to work on deep and challenging technical problems that have real world applicability and long-term industry-wide implications? Would you like to be a part of the team building the next generation of replication and synchronization technologies for Microsoft? If you are excited about tackling problems in this space we want to speak with you. The Data Replication and Synchronization team in SQL Server has been tasked with delivering innovative synchronization and replication techniques that are core building blocks for data movement within an organization and/or across the Internet. Solutions will span from high-end enterprise customers to the disconnected mobile work force.

    The Data Replication and Synchronization test team is seeking highly qualified SDETs to work on replication improvements of the next release of SQL Server. The team is very dynamic with ample opportunities for personal career development. You will participate in product architecture and design reviews, analyze the customer requirements, develop feature test plans, work with developers and program managers to ensure proper test coverage, design and implement automated tests in C++, C# and/or transact-SQL, debug problems, develop solid infrastructure and new tools. Qualifications include solid understanding of SQL Server or other database management systems, three years experience of testing commercial software products, the ability to multitask, and good communication skills. Prior experience and knowledge of replication technology is preferred. Proficiency of programming in one or more high level languages, such as C, C++ or C#, is required. Proficiency in transact-SQL programming is a plus. A bachelor’s degree in Computer Science or related technical field is preferred.

     

     

    Description 2:

     

     

    The Data Replication and Sync team is working to make sure that you can have your important data available no matter what PC, Device, or Web Page you are using. Have you ever considered that as computer technology becomes more pervasive, users are going to be faced more and more with the problem of getting access to something that they have on another machine or device? Data synchronization is going to be the cornerstone technology for solving this problem. By having assets on the PC, the server, the cloud, and on devices, Microsoft is in a better position than any other company in the industry to provide a solution that will allow you to move your data wherever you need it. In the Data Replication and Sync team we are working on a project called Harmonica to do exactly that.

     

    Do you want to work in a startup environment? We’re moving very fast with short milestones to enabling us to align with customers and partners. In less than a year we’ve already deployed this technology to the Windows Live team that provided the end to end solution for Favorites synchronization. Do you want to work with cool groups? We’re working with teams in Windows Live, Zune, and Office just to name a few. Want to work on cool technology? We are driving the core metadata layer for data synchronization that enables sharing in a multi-master mesh fashion and delivering the toolkit that will allow other teams to use this metadata to solve their synchronization problems. It is inherently distributed. It is algorithmically focused. We have lots of hard problems that still need to be solved and we need smart passionate people to work on them.

     

    A successful candidate will have:

     

    · Excellent coding skills in C, C++, or C#

    · Great ability to work with other teams and people

    · Excellent problem solving skills

    · A proven track record for shipping quality software on time

    · The ability to work within product teams and across product groups

    · Passion for engineering excellence and learning

    · Three or more years of experiencing in testing commercial software product

    · A BS or MS in computer science or related field

     

  • Web Synchronization Frequently Asked Questions

    Q. Is Web Synchronization supported with push subscriptions?

    A. No. Web Synchronization is supported only for pull subscriptions.

     

    Q. Can I use Web Synchronization for a SQL Server 2000 subscription?

    A. No. Web Synchronization is supported only for SQL Server 2005 subscriptions.

     

    Q. Does Web Synchronization support Logical Records?

    A. No. Web Synchronization does not supported Logical Records feature of SQL Server 2005.

     

    Q. What features of SQL Server 2005 are not supported with Web Synchronization?

    A. Web Synchronization does not support Logical Records, ExchangeType 1 (upload only) and ExchangeType 2 (download only) merges.

     

    Q. Is Web Synchronization supported if I run my IIS 6.0 server in IIS5.0 isolation mode?

    A. Yes, IIS can either be running in 6.0 mode or 5.0 isolation mode. Web Synchronization does not depend on the running mode of IIS server.

     

    Q. Can I use the Configure Web Synchronization wizard to create a virtual directory for SQL Server 2005 Mobile Edition subscribers?

    A. Yes, Configure Web Synchronization wizard can be used to create virtual directories for SQL Server 2005 or for SQL Server 2005 Mobile. For SQL Mobile virtual directory creation, the SQL Server 2005 Mobile Edition Server Tools need to be installed.

     

    Q. Does it mean that all subscriptions will start using Web Synchronization by setting the publication property @allow_web_synchronization=’true’?

    A. No, setting this property enables the publication to use Web Synchronization. As needed, subscribers can use Web Synchronization.

     

    Q. I do not find the Configure Web Synchronization menu item

    A. This context menu is based off of the merge publication. Open Object Explorer, Connect to the local server, Expand the Replication node and expand the Local Publications node. Right Click on (any) merge publication and select Configure Web Synchronization. Additionally you can create a virtual directory using a script that is packaged along with the SQL Server replication samples.

     

    Q. My publisher and IIS are 2 different machines. I do not have any publications on the IIS machine. How do I configure the virtual directory?

    A. On your IIS machine, open SQL Server Management Studio and connect to your publisher instance of SQL Server. Then follow the steps mentioned in the answer to the above question

     

    Q. My publisher and IIS are 2 different machines. I have installed only connectivity components on the IIS machine and it does not have a full SQL installation. How do I configure the virtual directory?

    A. Same as the above answer

     

    Q. I am creating subscription using RMO and wondering whether the subscriber needs to be on the same network as the publisher because the code to create the subscription makes a call the Publisher DB and I am not sure if this will work over the internet.

    A. There are multiple ways in which this can be resolved:

    ·       Have the subscriber in the same network and create a pull subscription. Now
     there is publisher connection and subscription can be created successfully. There after the subscriber can synchronize over the internet with no direct publisher connection.

    ·       If you do not want to or cannot have the subscriber in the same network as the publisher, then you will need to create an anonymous subscription and that will succeed. To set the anonymous subscription using RMO, here is an example:
    MergePullSubscription mergepullsub = new MergePullSubscription("subdb", pubserver, "pubdb", "publication", subconn)
    mergepullsub.Create();
    MergeSynchronizationAgent syncAgent = mergepullsub.SynchronizationAgent
    mergepullsub.SubscriptionType = SubscriptionOption.Anonymous
    syncAgent.SubscriptionType = SubscriptionOption.Anonymous;
    syncAgent.Synchronize();

    But note that anonymous subscriptions cannot be used for republishing.

     

    Q. When I am trying to create a virtual directory using the Configure Web Synchronization wizard, I get this error:

    TITLE: Welcome to the Configure Web Synchronization Wizard

    ------------------------------

     

    Your Web Server does not have a certificate installed and cannot be configured for Secure Sockets Layer (SSL) communications.  You must first configure your Web Server for SSL before continuing.

    ------------------------------

    BUTTONS:

    OK

    ------------------------------

    A. The reason for this error is that a certificate is not installed which is needed for configuring SSL. A certificate can either be acquired from a Certificate authority or for testing purposes one can be generated and installed from the below locations:

    For IIS6.0: http://www.microsoft.com/downloads/details.aspx?FamilyID=56fc92ee-a71a-4c73-b628-ade629c89499&displaylang=en

    For IIS5.0: http://support.microsoft.com/kb/228984

     

    Q. Can I modify a virtual directory after creating it?

    A. Yes. A virtual directory created using the Configure Web Synchronization wizard can be modified using the same wizard. One needs to select the “Configure an existing virtual directory” on the wizard.

     

    Q. I have 2 IIS servers. Can I use them both for Web Synchronization?

    A. Yes, you can. You will need to create separate virtual directories on both the machines and let your subscribers connect to any of these IIS servers for synchronization.

     

    Q. Does that mean I have to run the Configure Web Synchronization multiple times?

    A. Yes, you will need to run Configure Web Synchronization once on each of the IIS machines.

     

    Q. On the Web Synchronization wizard page of Directory Access, what is the check box “Grant the above users permissions to access the UNC snapshot share” for?

    A. If you are using UNC share for snapshot folder, then the user account that is going to run the synchronization (and/or the –InternetLogin account) needs to have at-least read permissions on the snapshot files in the UNC share for applying snapshot. This page of the wizard is the place where you would want to add all such users who will run the synchronization or be used in –InternetLogin so that they have can run initial synchronization successfully.

     

    Q. I don’t know that snapshot share. Can I skip the “Snapshot Share Access” page for now?

    A. Yes, you can skip this page by un-checking the “Grant the above users permissions to access the UNC snapshot share” checkbox. However, you will either need to grant read access to the snapshot share manually to the user account or re-run the Configure Web Synchronization again to provide appropriate permissions.

     

    Q. When I try to access https://myMachine.myDomain.com/myVirtualDirectory/replisapi.dll, I get a successful message. But when I try https://myMachine.myDomain.com/myVirtualDirectory/replisapi.dll?diag, I get “Access is Denied”.

    A. This is because the user running the ?diag or the InternetLogin used is not an admin on the IIS box. This is expected. However the synchronization is expected to pass.

     

    Q. When I run my synchronization I get “A security Error Occurred” error.

    A. This mostly has to do with a difference in the name of the IIS server you are using in your –InternetURL parameter for synchronization and the name of the machine on the certificate. Ensure that you are using the exact same name in -InternetURL as it is on the certificate. This includes any fully qualified domain name.

     

    Q. My synchronization fails with access denied.

    A. Try the following:

    ·   Does the user account running the synchronization or –InternetLogin have read permissions on the snapshot share?

    ·   Does the user account running the synchronization or –InternetLogin have read and execute permissions on replisapi.dll (residing in the virtual directory) on the IIS machine?

    ·   Is the user passed in -PublisherLogin in the Publication Access List (PAL)?

    ·   Does IIS allow anonymous authentication? If so, connections are attempted as the IUSR_<MACHINENAME> user which may not have the necessary permissions.

     

    Q. Why does my merge agent time out?

    A. Is there lot of data to be transferred from publisher to subscriber and while doing so, is the publisher server being taxed? If so, try rerunning the merge agent with higher values of –QueryTimeout, -LoginTimeout and –InternetTimeout (either in the agent command or the profile). Also increase the default connection timeout on the IIS server.

     

    Q. What port is my merge agent using to connect to the IIS server?

    A. By default port 443 is used to connect to the IIS server. So when running the Merge agent, you need not specify the port number. However if your IIS server is configured to use a non-default port, you need to specify it in the Merge Agent command line as: https://myMachine.myDomain.com:100/myVirtualDirectory /replisapi.dll

     

    Q. My merge agent fails and I am using Integrated authentication for Publisher/Distributor connection. I don’t know why?

    A. If you have a 3 machine setup with Publisher/Distributor, IIS server and Subscriber server all on 3 different machines and you are using integrated authentication, the merge agent can fail if Kerberos delegation is not enabled on the IIS server.

     

    Q. My merge agent fails with error messages indicating something is wrong with the Proxy setting.

    A. Are you using a proxy Server? If so you need to set it in Internet Explorer. Go to Internet ExploreràToolsàInternetOptionsàConnectionsàLANsettings. In the Proxy Server address text box, enter the proper proxy server and the port number in the port text box. If the local machine is hosting the IIS server for test purposes, it is recommended to check the box for “Bypass proxy server for local addresses” to avoid DNS lookup.

    Alternatively the Proxy server information can be set using the Merge agent parameters: -InternetProxyServer, -InternetProxyLogin and –InternetProxyPassword

    If you are not using proxy server uncheck this check box.

    If that does not solve your problem, then on the InternetExploreràToolsàInternetOptionsàConnectionsàLANsettings uncheck the Automatically Detect Settings check box and try your merge again.

     

    Q. I sometimes get an error message: “Header information is either corrupted or missing”

    A. Do you use ISA Server 2004 and have turned on the "Link Translation" feature?
    (for reporting services) If so, turn it off, and try is that helps to make the synchronization successful.

    Additionally verify that the IIS and the publisher side components are of the same version.

     

    Q. My merge agent when synchronizing fails, but I do not see the error in the Distributor or the Monitor?

    A. When using web synchronization the client does not connect to the distributor/publisher directly. It connects to the IIS server and the IIS server establishes a connection to the distributor/publisher. If the error is coming from the IIS server side, then the merge process has not yet connected to the publisher/distributor. Hence there won't be any errors logged in the distribution database. Check the merge agent console output, job details or websync.log file in your virtual directory on IIS server for errors.

     

    Q. Sometimes I get this error message: “The format of a message during Web synchronization was invalid. Ensure that replication components are properly configured at the Web server”

    A. Are you using a mix of .NET 1.1 and .NET 2.0 in your applications? If so, that can cause this error to happen. Try migrating your application to .NET 2.0 completely and see that fixes your problem.

  • Checking issues in merge replication filters

    Merge replication filters may affect replication performance significantly, if the filters are not designed properly. I have a script that will check the following issues in merge filters:

    • Checking existence of subset filter and join filters on the same article.
    • Checking existence of multiple join filters on the same article
    • Checking existence of circular join filters in articles
    • Finding deepest join filter chains of articles
    • Checking reference to un-published tables in article filters

    You can run it in a database enabled for merge replication. All rights reserved.

    set nocount on

    if DATABASEPROPERTYEX(DB_NAME(), 'IsMergePublished') = 0
      begin
        print 'Current database is not published for merge replication.'
        goto done
      end

    print ' '
    print '*****************************************************************************************************'
    print '* Checking existence of subset filter and join filters on the same article ...'
    print '*****************************************************************************************************'
    print ' '

    select 'pubname' = p.name, 'artname' = a.name
    into #multiplefilters1
    from sysmergearticles a, sysmergesubsetfilters f, sysmergepublications p
    where a.artid = f.artid
    and a.pubid = f.pubid
    and a.subset_filterclause is not null
    and a.subset_filterclause <> ''
    and a.pubid = p.pubid

    if exists (select * from #multiplefilters1)
      begin
        print 'The following articles have both subset filter and join filters. The logical relationship of these '
        print 'filters is OR and will be implemented as UNION in the article view. This may have performance impact.'
        print 'You may wish to consider re-implementing them.'
        print ' '
        select 'Publication Name' = pubname, 'Article Name' = artname
          from #multiplefilters1
        print ' '
      end

    drop table #multiplefilters1


    print '*****************************************************************************************************'
    print '* Checking existence of multiple join filters on the same article ...'
    print '*****************************************************************************************************'
    print ' '

    select f.pubid, f.artid
    into #multiplefilters2
    from sysmergesubsetfilters f
    group by f.pubid, f.artid
    having count(*) > 1

    if exists (select * from #multiplefilters2)
      begin
        print 'The following articles have multiple join filters. The logical relationship of these filters is OR '
        print 'and will be implemented as UNION in the article view. This may have performance impact. '
        print ' '
        select 'Publication Name' = p.name, 'Article Name' = a.name, 'Join Article Name' = f.join_articlename, 'Filter Name' = f.filtername
          from #multiplefilters2 m, sysmergearticles a, sysmergesubsetfilters f, sysmergepublications p
         where m.artid = a.artid
           and m.pubid = a.pubid
           and m.artid = f.artid
           and m.pubid = f.pubid
           and p.pubid = m.pubid
         order by 1, 2
        print ' '
      end

    drop table #multiplefilters2

    print '*****************************************************************************************************'
    print '* Checking existence of circular join filters in articles ...'
    print '*****************************************************************************************************'
    print ' '

    select distinct 'pubname' = p.name, 'artname' = a.name, f.join_articlename, 'level' = NULL
    into #multiplefilters3
    from sysmergesubsetfilters f, sysmergepublications p, sysmergearticles a
    where f.pubid = a.pubid
      and f.artid = a.artid
      and f.pubid = p.pubid

    declare @level int
    select @level = 0
    while @level = 0 OR @@rowcount > 0
      begin
        select @level = @level + 1
        update m3a
           set m3a.level = @level
          from #multiplefilters3 m3a
         where m3a.level is NULL
           and m3a.join_articlename not in (select artname from #multiplefilters3 m3b
         where m3b.level is NULL and m3a.pubname = m3b.pubname)
      end

    select @level = 0
    while @level = 0 OR @@rowcount > 0
      begin
        select @level = @level - 1
        update m3a
           set m3a.level = @level
          from #multiplefilters3 m3a
         where m3a.level is NULL
           and m3a.artname not in (select join_articlename from #multiplefilters3 m3b
         where m3b.level is NULL and m3a.pubname = m3b.pubname)
      end

    if exists (select * from #multiplefilters3 where level is NULL)
      begin
        print 'The following articles have circular join filters with other articles. This may have performance impact. '
        print ' '
        select 'Publication Name' = pubname, 'Article Name' = artname, 'Join Article Name' = join_articlename
          from #multiplefilters3
         where level is NULL
         order by 1, 2, 3
        print ' '
      end

    print '*****************************************************************************************************'
    print '* Finding deepest join filter chains of articles ...'
    print '*****************************************************************************************************'
    print ' '

    declare @max int, @min int, @start int, @end int, @pubname sysname, @artname sysname, @sequence int
    select @max = max(level), @min = min(level) from #multiplefilters3
    where level is not null
    if @max >= 0 - @min
      select @start = @max, @end = 1
    else
      select @start = -1, @end = @min

    declare #cursor_deepest_articles cursor for
    select pubname, artname from #multiplefilters3 where level = @start

    print 'The following artile sets have the deepest level join filters in the database. '
    print ' '

    select @sequence = 1
    open #cursor_deepest_articles
    fetch #cursor_deepest_articles into @pubname, @artname
    while @@fetch_status <> -1
      begin
        create table #articlechain (id int identity primary key, pubname sysname, artname sysname)
        insert #articlechain (pubname, artname) values (@pubname, @artname)
        if @max >= 0 - @min
          select @start = @max, @end = 1
        else
          select @start = -1, @end = @min

        while @start >= @end
          begin
            select @artname = join_articlename
              from #multiplefilters3
             where level = @start
               and artname = @artname
               and pubname = @pubname
            insert #articlechain (pubname, artname) values (@pubname, @artname)
            select @start = @start - 1
          end
        print 'Article Set #' + convert(varchar(32), @sequence)
        print ' '
        select 'Publication Name' = pubname, 'Article Name ' = artname from #articlechain order by id desc
        print ' '
        drop table #articlechain
        fetch #cursor_deepest_articles into @pubname, @artname
        select @sequence = @sequence + 1
      end
    deallocate #cursor_deepest_articles

    drop table #multiplefilters3


    print '*****************************************************************************************************'
    print '* Checking reference to un-published tables in article filters ...'
    print '*****************************************************************************************************'
    print ' '

    declare #csr_filtered_articles cursor for
    select a.pubid, a.artid, a.sync_objid
    from sysmergearticles a
    where a.subset_filterclause <> ''
    and a.subset_filterclause is not null
    or exists
    (select * from sysmergesubsetfilters f
    where a.artid = f.artid and a.pubid = f.pubid)

    create table #filter_article_dependencies (objid int primary key, expanded bit null)

    declare @pubid uniqueidentifier, @artid uniqueidentifier, @sync_objid int

    open #csr_filtered_articles
    fetch #csr_filtered_articles into @pubid, @artid, @sync_objid
    while @@fetch_status <> -1
      begin
        truncate table #filter_article_dependencies
        insert #filter_article_dependencies (objid, expanded) values (@sync_objid, NULL)
        while exists (select * from #filter_article_dependencies where expanded IS NULL)
          begin
            insert #filter_article_dependencies (objid, expanded)
            select distinct d.depid, 0
              from sysdepends d, #filter_article_dependencies f
             where d.id = f.objid
               and f.expanded IS NULL
               and not exists (select * from #filter_article_dependencies f2
                                where f2.objid = d.depid)
            update #filter_article_dependencies set expanded = 1 where expanded IS NULL
            update #filter_article_dependencies set expanded = NULL where expanded = 0
          end
        select f.objid
          into #nonpublishedtables
          from #filter_article_dependencies f, sysobjects o
         where f.objid = o.id
           and o.xtype = 'U'
           and f.objid not in (select objid from sysmergearticles where pubid = @pubid)

        if exists (select * from #nonpublishedtables)
          begin
            select @pubname = name from sysmergepublications where pubid = @pubid
            select @artname = name from sysmergearticles where pubid = @pubid and artid = @artid
            print ' '
            raiserror('Publication [%s] article [%s] references in its subset or join filter(s) ', 0, -1, @pubname, @artname)
            raiserror('the following tables that are not published. This may cause unexpected results at the subscriber.', 0, -1)
            select 'Table name' = object_name(objid) from #nonpublishedtables
          end
        drop table #nonpublishedtables
        fetch #csr_filtered_articles into @pubid, @artid, @sync_objid
      end
    deallocate #csr_filtered_articles
    drop table #filter_article_dependencies

    done:

  • Short paper on SQL Server 2005 transactional replication throughput

    The attached is a short paper by Amy Keenan of Microsoft. Amy did some experiments to study the potential bottleneck of transactional replication. This paper is based on the results of such experiments.

     

    SQL Server Transactional Replication Throughput Study

     

    Amy Keenan

    Microsoft Corporation

    amykeen[at]microsoft.com

     

    May 2006

     

     

    Background

     

    Over the years, we have heard about transactional replication performance issues from customers. Although most performance issues are caused by non-optimal designs of the application, it is also interesting to observe the performance behavior of transactional replication when stress on the system is high. In response to industry demands, SQL Server 2005 provides improvements in the performance and scalability of transactional replication, including: reducing blocking and contention between the Distribution Agent and cleanup jobs; and the option for the Distribution Agent to use multiple parallel streams when applying transactions to the Subscriber (the -SubscriptionStreams parameter of the Distribution Agent).

     

    The goal of this study is to identify possible bottlenecks in the throughput of data in SQL Server transactional replication. We conducted tests to measure throughput under different configurations. We found that with an adequate disk subsystem, transactional replication can sustain high throughput with low latency (latency is the time between when a change is made at a Publisher and when that change is applied at a Subscriber).

     

     

    Schema and Tests

     

    We performed a benchmark test to measure the throughput of transactional replication using a single push subscription.

     

    The initial tests on a slower disk subsystem revealed that we would reach a bottleneck on the hardware before reaching a bottleneck in the replication agents.  Moving to a storage area network (SAN) allowed for the elimination of the hardware bottleneck, and follow-up tests using the new multi-streaming Distribution Agent resulted in performance improvement measured up to 20% for the given schema and number of streams, resulting in a reduction in latency to less than 5 seconds by the end of the test.

     

    The test system had the following characteristics:

     

    ·        The schema used for this test was based on a real-world customer scenario and includes 59 tables with row sizes up to 2 KB, all published in a single publication.

    ·        The SQL Server instances used as the Publisher, Distributor, and Subscriber were the original released version of SQL Server 2005, build 9.00.1399.06.

    ·        The Publisher, Distributor and Subscriber were on separate servers with similar, though not identical hardware. Table 1 lists the hardware configuration of the computers used in the study.

     

    Table 1: Hardware configuration of the computers used in the study

     

    CPU Speed

    RAM

    Physical Processors

    Publisher

    2.0 GHz          

    8 GB

    4

    Subscriber

    2.0 GHz

    8 GB

    4

    Distributor

    2.8 GHz

    32 GB

    4

     

    There were 40 clients continuously updating the data in each of the published tables, with each transaction varying in size from 10 to 100 commands, and a workload consisting of 70% inserts, 20% updates, and 10% deletes.

     

    The Log Reader Agent and Distribution Agent ran continuously while the transactions were applied in the publication database.  We used Windows System Monitor to gather statistics and Replication Monitor to aggregate the results.

     

     

    Test Results

     

    The primary performance indicators measured for this test were:

     

    • Throughput of the Log Reader Agent (as measured by the number of commands per second processed)
    • Throughput of the Distribution Agent (as measured by the number of commands per second processed)
    • Log Reader Agent latency (the number of seconds between the time that a command is applied at the Publisher, and the corresponding record is inserted into the distribution database.)
    • Distribution Agent latency (the number of seconds between the time that a record is inserted into the distribution database, and the corresponding command is applied at the Subscriber.)

     

    Note that commands per second here refers to the number of SQL statements processed per second.

     

    We ran the test in three iterations, each lasting 24 hours:

     

    1)      The servers were using local storage (SCSI disks).

    2)      The servers were attached to a SAN, with a single stream for the Distribution Agent.

    3)      The servers were still on a SAN, with multiple streams for the Distribution Agent.

     

    The SAN is a Xiotech Magnitude 3D SAN with 84 fiber-connected drives, with 100 GB Data LUNs on 10 RAID 1+0 drives, and 40 GB LUNs on 6 RAID 1+0 drives.  Each controller had 2 GB cache, 1 GB used and 1 GB mirrored to the other controller.  Write caching was on for all database LUNs.

     

    In the first iteration, we bottlenecked on the disk before encountering any bottleneck in the replication components, as indicated by the disk queue lengths (Table 3).

     

    Table 2: Maximum and average throughput (in commands per second) of replication agents with servers using local storage

     

    Max

    Avg

    Log Reader

    14,612

    5,961

    Distribution

    14,173

    5,488

     

     

    Table 3: Maximum and average disk queue length on systems using local storage

     

    Max

    Avg

    Publisher

    15

    1

    Distributor

    301

    61

    Subscriber

    282

    66

     

     

    In the second iteration, the SAN eliminated the disk bottleneck.  However, within the first three hours of the test, we began to see increased latency in the Distribution Agent.  The maximum latency for the Distribution Agent was 722 seconds, with an average of 591 seconds.  The Log Reader Agent, with a maximum latency of 120 seconds and an average latency of 1 second, was able to keep up with the transaction load.

     

    Table 4: Maximum and average throughput (in commands per second) of replication agents with servers using a SAN

     

    Max

    Avg

    Log Reader

    24,443

    17,781

    Distribution

    18,508

    15,113

     

     

    Table 5: Maximum and average disk queue length on systems using a SAN

     

    Max

    Avg

    Distributor

    1

    1

    Publisher

    2

    1

    Subscriber

    6

    1

     

     

    From this second iteration, we observed that the Distribution Agent throughput now became the bottleneck.

     

    In the third iteration, we took advantage of the multi-streaming Distribution Agent introduced in SQL Server 2005. Going from one to three streams produced around 20% improvement in the Distribution Agent throughput.  During the 24 hours for the third iteration, the average latency for both of the agents was under 1 second, with the maximum latency being 2 seconds for the Log Reader Agent and 4 seconds for the Distribution Agent.  Both the Log Reader Agent and the Distribution Agent were therefore able to keep up with the workload without incurring latency.

     

    Table 6: Maximum and average throughput (in commands per second) of replication agents with servers using a SAN, and the Distribution Agent using three parallel streams

     

    Max

    Avg

    Log Reader

    25,033

    18,171

    Distribution

    24,921

    18,136

     

     

    Table 7: Maximum and average disk queue length on systems using SAN, and the Distribution Agent using three parallel streams

     

    Max

    Avg

    Distributor

    1