
SQL Server Team Blog

  • Merge Operation in Memory-Optimized Tables

    In Storage Allocation and Management for Memory-Optimized Tables, we briefly touched upon the Merge operation. This blog describes the Merge operation in detail. We also recommend reading Implementing Durability for Memory-Optimized Tables for a good understanding of data/delta files and CFPs.

    The metadata of all Checkpoint File Pairs (CFPs) that exist in storage is stored in an internal array structure referred to as the storage array. It is a finitely sized array of CFPs (4096 entries in CTP2 and 8192 entries in RTM) supporting a cumulative size of 256 GB for durable memory-optimized tables in the database. The entries in the storage array are ordered by transaction range. The CFPs in the storage array (along with the tail of the log) represent all the on-disk state required to recover a database with memory-optimized tables.

    Running an OLTP workload over time can cause the storage array to fill up, leading to the inefficiencies described below:

    1. Deleted rows – Deleted rows are not actually removed from the data file; instead, a reference to each deleted row is added to the corresponding delta file. This wastes storage, which in turn negatively impacts recovery time.
    2. Storage array manipulation overhead – Internal processes search the storage array for operations such as finding the delta file in which to record a deleted row. The cost of these operations increases with the number of entries.

    To alleviate these inefficiencies, older closed CFPs are merged based on a merge policy (described below) so that the storage array is compacted to represent the same set of data with a reduced number of CFPs and less storage.

    Merge Operation

    The Merge operation takes one or more closed CFPs, called MERGE SOURCE(s), based on an internally defined merge policy, described later, and produces one resultant CFP, called MERGE TARGET. The entries in each delta file of source CFP(s) are used to filter rows from the corresponding data file to remove the data rows that are not needed anymore. The remaining rows in the source CFPs are consolidated into one target CFP.  After the merge is complete, the resultant CFP (i.e. the merge-target) replaces the source CFPs (i.e. the merge sources). The merge-source CFPs go through a transition phase before they are eventually removed from the storage.

    In the example below, the memory-optimized table filegroup has four data and delta file pairs at timestamp 500, containing data from previous transactions. For example, the rows in the first data file correspond to transactions with timestamp > 100 and <= 200, alternatively represented as (100, 200]. The second and third data files are shown to be less than 50% full after accounting for the rows marked deleted. The merge operation combines these two CFPs and creates a new CFP containing transactions with timestamp > 200 and <= 400, the combined range of these two CFPs. Another CFP with range (500, 600] and a non-empty delta file for transaction range (200, 400] show that the Merge operation can run concurrently with transactional activity, including deleting more rows from the source CFPs.
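
    To make the bookkeeping concrete, here is a small Python sketch of a merge like the (200, 400] one above. This is an illustrative model only, not SQL Server internals; the CFP representation and the merge_cfps helper are assumptions made for the example.

    ```python
    # Illustrative sketch (not SQL Server internals): merging two adjacent CFPs.
    # A CFP is modeled as a transaction-timestamp range, a list of inserted rows
    # (row_id, timestamp), and a set of row_ids referenced by its delta file.

    def merge_cfps(sources):
        """Combine adjacent CFPs: drop rows referenced by delta files,
        keep the union of the transaction ranges."""
        lo = min(c["range"][0] for c in sources)
        hi = max(c["range"][1] for c in sources)
        rows = [r for c in sources
                  for r in c["data"] if r[0] not in c["delta"]]
        return {"range": (lo, hi), "data": rows, "delta": set()}

    cfp2 = {"range": (200, 300), "data": [(1, 210), (2, 250), (3, 290)],
            "delta": {2}}                     # row 2 was deleted
    cfp3 = {"range": (300, 400), "data": [(4, 310), (5, 390)], "delta": {5}}

    target = merge_cfps([cfp2, cfp3])
    print(target["range"])                    # (200, 400) -- combined range
    print([r[0] for r in target["data"]])     # [1, 3, 4] -- deleted rows gone
    ```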

    In the In-Memory OLTP engine, a background thread evaluates all closed CFPs using the merge policy and then initiates one or more merge requests for the qualifying CFPs. These merge requests are processed by the offline checkpoint thread. The merge policy is evaluated periodically and also when a checkpoint closes.

    Merge Policy

    SQL Server 2014 implements the following merge policy:

    • A merge is scheduled if 2 or more consecutive CFPs can be consolidated, after accounting for deleted rows, such that the resultant rows fit into one data file of the ideal size. The ideal size is determined as follows:
      • For machines with 16 GB of memory or less, the data file is 16 MB and the delta file is 1 MB (not supported in CTP2)
      • For machines with more than 16 GB of memory, the data file is 128 MB and the delta file is 8 MB

    A maximum of 10 consecutive CFPs can be part of one merge operation.

    • A single CFP can be self-merged if the data file is larger than 256 MB and over half of the rows are marked deleted. A data file can grow larger than 128 MB if, for example, a single transaction or multiple concurrent transactions insert/update a large amount of data. This causes the data file to grow beyond its ideal size because a transaction cannot span multiple CFPs. Note that self-merge is not supported in CTP2.

    Here are some examples that show the CFPs that will be merged under the merge policy.

    • Adjacent source CFPs (% full): CFP0 (30%), CFP1 (50%), CFP2 (50%), CFP3 (90%)
      Merge selection: (CFP0, CFP1). CFP2 is not chosen because it would make the resultant data file > 100% of the ideal size.
    • Adjacent source CFPs (% full): CFP0 (30%), CFP1 (20%), CFP2 (50%), CFP3 (10%)
      Merge selection: (CFP0, CFP1, CFP2). Files are chosen starting from the left. CFP3 is not combined because it would make the resultant data file > 100% of the ideal size.
    • Adjacent source CFPs (% full): CFP0 (80%), CFP1 (30%), CFP2 (10%), CFP3 (40%)
      Merge selection: (CFP1, CFP2, CFP3). Files are chosen starting from the left. CFP0 is skipped because, if combined with CFP1, the resultant data file would be > 100% of the ideal size.
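
    The selections above follow a simple left-to-right greedy rule. The sketch below is a hypothetical Python model of that rule (the engine's actual implementation is internal and more involved): group consecutive CFPs whose combined fullness stays within 100% of the ideal data file size, requiring at least 2 and at most 10 sources per merge.

    ```python
    # Hypothetical model of the merge-policy selection described above;
    # fullness is each CFP's fill percentage relative to the ideal data
    # file size, scanned left to right.

    def select_merges(fullness, max_cfps=10):
        merges, i = [], 0
        while i < len(fullness):
            j, total = i, 0
            while (j < len(fullness) and j - i < max_cfps
                   and total + fullness[j] <= 100):
                total += fullness[j]
                j += 1
            if j - i >= 2:                 # a merge needs 2+ source CFPs
                merges.append(tuple(range(i, j)))
                i = j
            else:                          # this CFP can't merge with its neighbor
                i += 1
        return merges

    print(select_merges([30, 50, 50, 90]))   # [(0, 1)]
    print(select_merges([30, 20, 50, 10]))   # [(0, 1, 2)]
    print(select_merges([80, 30, 10, 40]))   # [(1, 2, 3)]
    ```

    The three calls reproduce the three example selections above.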

    Not all CFPs with available space qualify for merge. For example, if two adjacent CFPs are 60% full, they will not qualify for merge, resulting in 40% wasted storage for these CFPs. In the worst case, all CFPs can be a little over 50% full, leading to storage utilization of only about 50%. It is important to note that deleted rows may exist in the storage even though they have already been removed (i.e. garbage collected) from memory; the management of storage and memory is somewhat independent from a garbage collection perspective. A rule of thumb is that the storage taken by ACTIVE CFPs (note that this does not account for all CFPs) can be up to 2x the size of the durable tables in memory.

    You can force a merge using the stored procedure sys.sp_xtp_merge_checkpoint_files (Transact-SQL) to override the merge policy. For details, please refer to http://msdn.microsoft.com/en-us/library/dn198330(v=sql.120).aspx

    Life Cycle of a CFP

    As mentioned earlier, once CFPs are merged, they go through a state transition before they can be deallocated. At any given time, a CFP's data/delta file pair can be in one of the following states:

    • PRECREATED – A small set of CFPs is kept pre-allocated to minimize or eliminate any waits to allocate new files as transactions are being executed. These are full sized, with a data file of 128 MB and a delta file of 8 MB, but contain no data. The number of such CFPs is computed as the number of logical processors or schedulers, with a minimum of 8. This is a fixed storage overhead in databases with memory-optimized tables.
    • UNDER CONSTRUCTION – Set of CFPs that store newly inserted and possibly deleted data rows since the last checkpoint.
    • ACTIVE – These contain the inserted/deleted rows from previous closed checkpoints. These CFPs contain all the inserted/deleted rows required before applying the active part of the transaction log at database restart. We expect the size of these CFPs to be approximately 2x the in-memory size of the memory-optimized tables, assuming the merge operation is keeping up with the transactional workload.
    • MERGE TARGET – This CFP stores the consolidated data rows from the CFP(s) identified by the merge policy. Once the merge is installed, the MERGE TARGET transitions into the ACTIVE state.
    • MERGED SOURCE – Once the merge operation is installed, the source CFPs are marked as MERGED SOURCE. Note that the merge policy evaluator may identify multiple merges, but a CFP can only participate in one merge operation.
    • REQUIRED FOR BACKUP/HA – Once the merge has been installed and the MERGE TARGET CFP is part of a durable checkpoint, the merge source CFPs transition into this state. CFPs in this state are needed for operational correctness of a database with memory-optimized tables, for example, to recover from a durable checkpoint in order to go back in time. A CFP can be marked for garbage collection once the log truncation point moves beyond its transaction range.
    • IN TRANSITION TO TOMBSTONE – These CFPs are no longer needed by the In-Memory OLTP engine and can be garbage collected. This state indicates that they are waiting for the background thread to transition them to the TOMBSTONE state.
    • TOMBSTONE – These CFPs are waiting to be garbage collected by the filestream garbage collector.  Please refer to FS Garbage Collection for details
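
    The states above form an essentially linear life cycle for a merged CFP. The following Python sketch models it as a small state machine; the transition table is an assumption distilled from the state descriptions, not engine code.

    ```python
    # Sketch of the CFP life cycle as a state machine (illustrative model
    # distilled from the state descriptions above, not engine internals).

    NEXT = {
        "PRECREATED": {"UNDER CONSTRUCTION"},
        "UNDER CONSTRUCTION": {"ACTIVE"},          # checkpoint closes the CFP
        "MERGE TARGET": {"ACTIVE"},                # merge is installed
        "ACTIVE": {"MERGED SOURCE"},               # chosen as a merge source
        "MERGED SOURCE": {"REQUIRED FOR BACKUP/HA"},
        "REQUIRED FOR BACKUP/HA": {"IN TRANSITION TO TOMBSTONE"},
        "IN TRANSITION TO TOMBSTONE": {"TOMBSTONE"},
        "TOMBSTONE": set(),                        # awaits filestream GC
    }

    def transition(state, new_state):
        if new_state not in NEXT[state]:
            raise ValueError(f"illegal transition: {state} -> {new_state}")
        return new_state

    # Walk one CFP from creation to the point where filestream GC removes it.
    state = "PRECREATED"
    for nxt in ["UNDER CONSTRUCTION", "ACTIVE", "MERGED SOURCE",
                "REQUIRED FOR BACKUP/HA", "IN TRANSITION TO TOMBSTONE",
                "TOMBSTONE"]:
        state = transition(state, nxt)
    print(state)   # TOMBSTONE
    ```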

    After accounting for the storage taken by CFPs in the various states, the overall storage taken by durable memory-optimized tables can be much larger than 2x. The DMV sys.dm_db_xtp_checkpoint_files can be queried to list all the CFPs in the memory-optimized filegroup, including the phase they are in. Transitioning CFPs from the MERGED SOURCE state to TOMBSTONE, and ultimately getting them garbage collected, can take up to 5 checkpoint and transaction log backup steps (the log backups apply if the database is configured for the full or bulk-logged recovery model). You can, of course, manually force a checkpoint followed by a log backup to expedite garbage collection, but this will add 5 empty CFPs (i.e. 5 data/delta file pairs with a data file of 128 MB each). In production scenarios, the automatic checkpoints and the log backups taken as part of your backup strategy will seamlessly transition CFPs through the various states without requiring any manual intervention. The impact of the garbage collection process is that databases with memory-optimized tables may have a disproportionate storage footprint compared to their size in memory. It is not uncommon for the storage occupied by memory-optimized tables to be 3x their size in memory.

    In the next blog, we will walk through an example showing transition of CFPs across all phases.

  • Hortonworks Makes HDP 2.0 for Windows Server Generally Available

    Today, Hortonworks announced the general availability of the Hortonworks Data Platform (HDP) 2.0 for Windows. This is the next release of the industry’s only 100% Apache Hadoop-based distribution for Windows. With Windows Server leading all operating systems in usage share (two out of every three servers run Windows Server), HDP 2.0 for Windows dramatically broadens the accessibility of Hadoop.

    This release is important because it is engineered on the latest version of Apache Hadoop, 2.2. It takes advantage of the power of YARN, offers improved high availability and mean time to recovery, and incorporates phase 2 of the Stinger initiative and HBase 0.96 to deliver snapshots.

    As part of our overall big data approach, we are deeply invested in ensuring customers have choice and interoperability when building and running Hadoop-based solutions. This includes providing customers the choice to deploy Hadoop data in the cloud on Windows Azure. Applications built on HDP for Windows on-prem can be easily deployed in the cloud for platform flexibility. Additionally, customers can leverage Power BI for Office 365 to connect to either HDP 2.0 for Windows or Windows Azure HDInsight to empower Microsoft Excel users with compelling insights on big data.

    We’re partnering closely with Hortonworks to bring the community new Hadoop solutions for Windows. HDP for Windows continues our joint commitment to opening up the Hadoop platform to as many people as possible.

    You can find out more about today’s announcement by:

    • Reading the Hortonworks blog
    • Reading about Microsoft and Hortonworks involvement with the open source community
    • Joining us for a Microsoft and Hortonworks webinar on February 6, 2014
    • Trying Power BI on HDP for Windows or Windows Azure HDInsight
    • Joining us for a #bigdata hackathon in February prior to the @strataconf

    Herain Oberoi
    Director of Product Marketing

  • Storage Allocation and Management for Memory-Optimized Tables

    As described in Implementing Durability for Memory-Optimized Tables, the storage requirements and management for memory-optimized tables are very different from those of disk-based tables. If you are migrating a subset of your disk-based tables to memory-optimized tables, or are developing a new application with memory-optimized tables, you will notice these differences. Unlike disk-based tables in SQL Server, which use 8 KB pages to store data in the same format both on disk and in memory, memory-optimized tables use a different size and format on disk than in memory. This has been a point of concern for many customers, as they have sometimes observed disproportionate storage consumed by memory-optimized tables. The goal of this blog is to alleviate this confusion by describing how storage is managed over the lifetime of data rows.

    The data for memory-optimized tables is stored in one or more data/delta file pairs (also referred to as checkpoint file pairs, or CFPs), with data files storing inserted rows and delta files referencing deleted rows. As DML operations update, insert, and delete rows during the execution of an OLTP workload, new data/delta files are created to persist the data, and existing delta files are updated to record deletes of existing rows. Over time, if the number of active rows (after accounting for deleted rows) in two or more consecutive CFPs falls below a threshold (usually < 50%) such that they can be merged into one CFP of 128 MB, they are merged automatically by a background merge process into a new CFP. Once the merge operation is complete, the older CFPs go through a transition phase and are eventually removed (i.e. garbage collected) from the storage. Note that SQL Server 2014 CTP2 supports up to 4096 CFPs within a database; this limit will be changed to 8192 CFPs in RTM.

    At any given time, the data/delta file pairs are in one of the following 4 categories:

    1. Pre-allocated CFPs (data and delta): A small set of CFPs is kept pre-allocated to minimize or eliminate any waits to allocate new files as transactions are being executed. These are full sized, with a data file of 128 MB and a delta file of 8 MB, but contain no data. The number of such CFPs is computed as the number of logical processors or schedulers, with a minimum of 8. This is a fixed storage overhead in databases with memory-optimized tables.
    2. Active CFPs: These contain the inserted/deleted rows for the last durable checkpoint. These CFPs contain all the inserted/deleted rows required before applying the active part of the transaction log at database restart. We expect the combined size of the active CFPs to be at most 2x the in-memory size of the memory-optimized tables. However, there are cases, specifically with data files > 128 MB (which can happen due to large concurrent transactions) or when the merge operation falls behind, in which we may exceed this 2x limit. To simplify the discussion here, let us just assume all data files are 128 MB and that the merge/checkpoint operations are keeping up.
    3. CFPs required for operational correctness: These represent files that were sources of MERGE operations, where rows from these files were moved to new data/delta files. These files (i.e. CFPs) are not needed for database restart. However, they can’t be deallocated or marked for deallocation (i.e. garbage collection) because they are needed if SQL Server has to recover from a previous durable checkpoint. A CFP can be marked for garbage collection once the log truncation point moves beyond its transaction range. This requires a combination of checkpoint operations and transaction log backups. Transaction log backups are not needed if the database is configured in the simple recovery model.
    4. CFPs that can be removed from storage: These are garbage collected similar to how file-stream files are garbage collected.  Please refer to FS Garbage Collection for details

    Transitioning CFPs out of category 3 and category 4 can take up to 5 checkpoint and transaction log backup steps (the log backups apply if the database is not running in the simple recovery model). For databases with memory-optimized tables, an automatic checkpoint is taken when the transaction log has grown by 512 MB since the last checkpoint. You can, of course, manually force a checkpoint followed by a log backup to expedite garbage collection, but this will add 5 empty CFPs (i.e. 5 data/delta file pairs with a data file of 128 MB each). Typically, in a test environment, you may have to force transaction log backups and checkpoints to remove CFPs that are no longer needed. In production scenarios, however, we don’t expect customers to take manual checkpoints but to rely on the automatic checkpoints and the log backups taken as part of the backup strategy. The impact of this garbage collection process is that databases with memory-optimized tables may have a disproportionate storage footprint compared to their size in memory.

    To clarify the point further, let us walk through some examples of databases with memory-optimized tables under various scenarios and compare their size in memory and on storage. For these examples, we will assume a SQL Server instance with 8 logical processors and 1 memory-optimized table with the following schema. Note that this table has a row size of approximately 8 KB.

    CREATE TABLE dbo.t_memopt (
           c1 int NOT NULL,
           c2 char(40) NOT NULL,
           c3 char(8000) NOT NULL,
           CONSTRAINT [pk_t_memopt_c1] PRIMARY KEY NONCLUSTERED HASH (c1) WITH
           (BUCKET_COUNT = 100000)
    ) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA)

    As indicated earlier, the fixed storage overhead for this configuration is 1 GB (i.e. 8 data files of 128 MB each).
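
    The fixed overhead can be sketched with a little arithmetic; the formula below (max(logical processors, 8) pre-created CFPs of 128 MB data plus 8 MB delta each) is an assumption based on the description above, not an engine API.

    ```python
    # Back-of-the-envelope sketch of the fixed pre-created storage overhead.
    # Assumed formula: max(logical processors, 8) pre-created CFPs, each with
    # a 128 MB data file and an 8 MB delta file.

    def precreated_overhead_mb(logical_processors):
        n = max(logical_processors, 8)
        return n * (128 + 8)

    print(precreated_overhead_mb(8))    # 1088 MB -- roughly the "1 GB"
                                        # (8 x 128 MB data files) cited above
    print(precreated_overhead_mb(16))   # 2176 MB on a 16-processor machine
    ```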

    • Example-1:
      • DML Operation: Insert 1 row
      • In-memory size – You can get the size by querying sys.dm_db_xtp_table_memory_stats. The data row takes 8 KB, while the hash index takes 1 MB for allocating 100K buckets:

    Name       memory_used_by_indexes_kb memory_used_by_table_kb
    ---------- ------------------------- -----------------------
    t1         1024                      8

      • Storage for memory-optimized table – The storage is more interesting. On a machine with 8 logical processors, there are 8 data and 8 delta files pre-allocated, and there is 1 data/delta file pair used to store the currently active rows. So the overall storage for the memory-optimized filegroup is (9 * 128 MB) = 1.12 GB, which is disproportionately larger than the size of the in-memory table. This is expected because this configuration has a fixed overhead of 1 GB, and the granularity of storage allocation is 128 MB. This is different from disk-based tables, where the unit of storage allocation is an 8 KB page for tables < 64 KB in size. This additional storage for memory-optimized tables may surprise you if you are just playing with small data sets. However, we don’t expect this to be a cause for concern, as typical customer workloads will have much larger data sets (e.g. tens or hundreds of GB) stored in memory-optimized tables.
    • Example-2
      • Operation: Insert 1-million rows
      • In-memory Size – 7.8GB
      • Storage for memory-optimized table – 8.9 GB, which is quite reasonable compared to the previous example: 1 million rows take approximately 8 GB of storage, plus the 1 GB of fixed overhead, which is now around 10% of the total.
      • Storage for disk-based table – it is around 8GB so it is of the same order as memory optimized tables.
    • Example-3
      • Operation: Update non-key values for the first 500K rows of the table loaded in the previous example.
      • In-memory size (used) – 8 GB. When this was measured, most of the stale row versions had been garbage collected.
      • Storage for memory-optimized table – 13 GB. You may wonder why the size of storage has gone up for an update operation. An update on a memory-optimized table is done as a delete followed by an insert. The delta file(s) are updated to reflect the deleted rows, and the new row versions are inserted into another data file. At this point, both the old and new versions of the data rows are persisted, which is why storage consumption has gone up by approximately 50%. The data/delta files will eventually get merged and garbage collected as described earlier.
      • Storage for disk-based tables – It stays around 8 GB. This is expected because updates of non-key columns are done in place for disk-based tables.
    • Example-4
      • Workload/Operations – An update-only OLTP workload running against a database of size 150 GB for over 24 hours, to measure storage and memory in the steady state.
      • In-memory size (used) – 167 GB. This shows that the stale row versions are getting garbage collected. Please refer to Memory Management for Memory-Optimized Tables for details.
      • Storage for memory-optimized table – 320 GB. Given that the size of the database is 150 GB, the storage in steady state is around 2.2 times the in-memory size.
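
    A rough back-of-the-envelope check of Example 3’s 13 GB figure, under the simplifying assumption that storage is just the fixed overhead plus one 8 KB version per inserted row and one extra version per updated row:

    ```python
    # Rough check of Example 3 (illustrative arithmetic only; real file sizes
    # and merge timing vary). An update is a delete plus an insert, so each
    # updated row temporarily has two persisted 8 KB versions.

    ROW_KB = 8
    fixed_overhead_gb = 1.0                      # 8 pre-created 128 MB data files
    inserted_gb = 1_000_000 * ROW_KB / 1024**2   # ~7.6 GB from the initial load
    updated_gb  =   500_000 * ROW_KB / 1024**2   # ~3.8 GB of new row versions

    total_gb = fixed_overhead_gb + inserted_gb + updated_gb
    print(round(total_gb, 1))   # 12.4 -- in the same ballpark as the observed 13 GB
    ```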

    Though these examples are oversimplified, they do illustrate the difference in storage provisioning for memory-optimized tables compared to disk-based tables. Your storage requirement will depend on key factors such as:

      • The size of your durable memory-optimized tables
      • The DML operations on the durable tables. For example, if your workload is update-heavy, it leads to merges, thereby inflating the storage requirement until the source files get garbage collected
      • The frequency of checkpoints and log backups
  • SQL Server 2014 In Memory OLTP: Memory-Optimized Table Types and Table Variables

    SQL Server 2014 In-Memory OLTP introduces two new widely advertised kinds of database objects: memory-optimized tables for efficient data access, and natively compiled stored procedures for efficient query processing and business logic execution. However, there is a third kind of object introduced by In-Memory OLTP: the memory-optimized table type.

    The use of memory-optimized table variables has a number of advantages over traditional table variables:

    • The variables are truly in memory: they are guaranteed to never spill to disk.
    • Data access is more efficient due to the same memory-optimized algorithm and data structures used for memory-optimized tables, especially when the variables are used in natively compiled stored procedures.
    • Finally, with memory-optimized table variables there is no tempdb utilization: table variables are not in tempdb, and do not use any resources in tempdb.

    The typical usage scenarios for memory-optimized table variables are:

    • Storing intermediate results and creating single result sets based on multiple queries in natively compiled stored procedures.
    • Passing table-valued parameters into natively compiled stored procedures, as well as traditional interpreted procedures.
    • As replacement for traditional table variables, and in some cases for #temp tables that are local to a stored procedure.  This is particularly useful if there is a lot of tempdb contention in the system.

    As with memory-optimized tables, SQL Server generates a DLL for each memory-optimized table type. The DLL includes the functions for accessing indexes and retrieving data from the table variables. When a memory-optimized table variable is declared based on the table type, an instance of the table and index structures corresponding to the table type is created in the user session. The table variable can then be used in the same way as traditional table variables: you can insert/update/delete rows in the table variable, and you can use the variable in T-SQL queries. You can also pass these variables into natively compiled as well as traditional stored procedures, as table-valued parameters (TVPs).

    The following sample shows a memory-optimized table type from the AdventureWorks-based In-Memory OLTP sample.

    CREATE TYPE [Sales].[SalesOrderDetailType_inmem] AS TABLE(
          [OrderQty] [smallint] NOT NULL,
          [ProductID] [int] NOT NULL,
          [SpecialOfferID] [int] NOT NULL,
          [LocalID] [int] NOT NULL,
          INDEX [IX_ProductID] HASH ([ProductID])
                WITH ( BUCKET_COUNT = 8),
          INDEX [IX_SpecialOfferID] HASH ([SpecialOfferID])
                WITH ( BUCKET_COUNT = 8)
    )
    WITH ( MEMORY_OPTIMIZED = ON )

    As you can see, the syntax is very similar to traditional table variables. A few things to keep in mind:

    • The option MEMORY_OPTIMIZED=ON indicates that the table type is indeed memory-optimized.
    • The type must have at least one index. Like with tables, supported index types are hash and nonclustered.
      • Keep in mind that for a hash index the bucket_count should be about 1-2x the number of expected unique index keys. Over-sizing is usually better than under-sizing: if you sometimes insert only 2 values into the variable, but sometimes insert up to 1000 values, it’s usually better to specify BUCKET_COUNT=1000.
    • The restrictions on memory-optimized tables related to data types and constraints also apply to memory-optimized table types. For example, in SQL Server 2014 default constraints are supported, but check constraints are not.
    • Like memory-optimized tables, table variables do not support parallel plans.
    • Like memory-optimized tables, table variables must fit completely in memory – there is no spill to disk.
    • Traditional table variables exist in tempdb. In contrast, memory-optimized table variables exist in the user database.
    • Memory-optimized table types need to be specified explicitly using a CREATE TYPE statement. Specification in-line with the variable declaration is not supported.
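
    As a sketch of the bucket sizing guidance: SQL Server rounds the bucket count you specify up to the next power of two (so specifying BUCKET_COUNT = 1000 yields 1024 buckets). The helpers below model that rounding and the 1-2x rule of thumb; they are illustrative Python, not an engine API.

    ```python
    # Illustrative helpers for hash-index bucket sizing (assumptions based on
    # the guidance above, not a SQL Server API).

    def effective_bucket_count(requested):
        """Round the requested bucket count up to the next power of two,
        as the engine does."""
        n = 1
        while n < requested:
            n *= 2
        return n

    def suggested_bucket_count(expected_unique_keys, factor=2):
        """Apply the 1-2x rule of thumb, then round like the engine."""
        return effective_bucket_count(expected_unique_keys * factor)

    print(effective_bucket_count(1000))     # 1024
    print(suggested_bucket_count(100000))   # 262144 (2 x 100000, rounded up)
    ```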

    Table-Valued Parameters

    The following example script shows declaration of a table variable as the memory-optimized table type Sales.SalesOrderDetailType_inmem, insert of three rows into the variable, and passing the variable as a TVP into Sales.usp_InsertSalesOrder_inmem.

    DECLARE @od Sales.SalesOrderDetailType_inmem,
          @SalesOrderID uniqueidentifier,
          @DueDate datetime2 = SYSDATETIME()

    INSERT @od (LocalID, ProductID, OrderQty, SpecialOfferID) VALUES
          (1,888,2,1),
          (2,450,13,1),
          (3,841,1,1)

    EXEC Sales.usp_InsertSalesOrder_inmem
          @SalesOrderID = @SalesOrderID,
          @DueDate = @DueDate,
          @OnlineOrderFlag = 1,
          @SalesOrderDetails = @od

    Note that memory-optimized table types can be used as the type for stored procedure table-valued parameters (TVPs), and can be referenced by clients in exactly the same way as traditional table types and TVPs. Therefore, invocation of stored procedures, including natively compiled stored procedures, with memory-optimized TVPs works exactly the same way as invocation of traditional stored procedures with traditional TVPs.

    Cross-Database Queries

    In SQL Server 2014, memory-optimized tables do not support cross-database transactions: you cannot access another database from the same transaction or the same query that also accesses a memory-optimized table. As a result, you cannot straightforwardly copy data from a table in one database to a memory-optimized table in another database.

    Table variables are not transactional. Therefore, memory-optimized table variables can be used in cross-database queries, and can thus facilitate moving data from one database into memory-optimized tables in another. The idea is to use two transactions: in the first transaction you insert the data from the remote table into the variable; in the second transaction you insert the data into the local memory-optimized table from the variable.

    For example, if you want to copy the row from table t1 in database db1 to table t2 in db2, using variable @v1 of type dbo.tt1, you would use something like:

    USE db2
    GO
    DECLARE @v1 dbo.tt1
    INSERT @v1 SELECT * FROM db1.dbo.t1
    INSERT dbo.t2 SELECT * FROM @v1
    GO

  • SQL Server 2014 Hybrid Cloud Scenarios: Migrating On-Premises SQL Server to Windows Azure Virtual Machines

    Those of you who have tried the new version of SQL Server Management Studio might have noticed that it has a couple of new wizards. One of these sits next to the already existing wizard that allows you to deploy to the Windows Azure SQL Database service (formerly SQL Azure), and it is called Deploy Database to a Windows Azure Virtual Machine (VM).

    You might be wondering why this wizard is here and what it can do. The post below tries to answer this and some related questions.

    The Path to Windows Azure

    We have seen many times that when people try a Virtual Machine in the Windows Azure environment, they quickly come up with a number of questions. Once the first set of questions is resolved, the next wave comes up, then the next, and a few more after that.

    I have tried to compile a list of the things you can expect.

    1. Get access to Windows Azure.
      Logging into the Windows Azure Portal might not be enough to unlock the full power of Azure. In the majority of cases you will need a Management Certificate. There is nothing special about it, except that it should be available on the machine you are using and it should be known to Windows Azure. This means you need to create a certificate somehow. An alternative option is to download a Publishing Profile from the Portal. In this case the Portal will generate a certificate on your behalf, add it to the subscriptions you have access to, and share it with you in the form of an XML file.
    2. Create and configure your VM.
      When you have access to the Azure environment, you can start creating the other pieces you need. The first thing to create is a VM. Don’t forget that you need a VM that:
      1. Has SQL Server installed, with a version the same as or higher than the version of SQL Server you currently have; otherwise your deployment capabilities will be very limited.
      2. Has an Azure Drive attached; you will need it to store your data.
      3. In case it was created in a Cloud Service, has its endpoints configured properly; otherwise, make sure you have an Azure VNet and a corresponding VPN.
      4. Accepts connections to the SQL Server instance, meaning connections should be accepted by SQL Server and the user should be able to authenticate. If your machine is not joined to a domain, only SQL authentication is available, so don’t forget to set up a SQL login as part of the instance configuration in advance.
      5. Has Windows Firewall configured to allow connections to SQL Server to go through.
      6. Plus many other steps need to be taken.
    3. Copy the data from the source system.
      Once your target system is set up, you need to focus on copying data from the source system. The best option, which gives you the most recovery capability, is to use Azure Storage as an intermediate location. In this case you either copy your files directly using Backup to URL, or back them up locally and then use the Azure SDK to upload them to storage.
    4. Get the data to the VM.
      Once the data is in the cloud, you need to get it to the VM. Depending on the data size, you can either pull it to the VM and restore locally, or just use Restore from URL.

    These steps describe what needs to happen to end up with your workload in a Windows Azure VM. There are many small details in this process, as well as a lot of things to consider. You can consult this article if you need more information on the subject.

    The Wizard

    You might be wondering how the process above relates to the wizard described in this post. This is exactly what the Deploy Database to a Windows Azure VM wizard does for you, plus a few additional conveniences.

    Let’s see how this looks in the wizard.

    The first screen gives you an idea of what information you will need while using the wizard.

     

    The Source Settings screen asks you for only two pieces of information:

    • Connection to the source server and database
    • Where to place backup files (a file or UNC path). Please note: this path should look the same to both the wizard and the Database Engine

     

    Once you have told the wizard where to get the data, you need to connect to Windows Azure. There are three options:

    • Manually provide the management certificate and paste or type a subscription ID
    • Import a publishing profile if you have already downloaded it
    • Use your Microsoft account (formerly Windows Live ID) to sign in to the portal and let the wizard retrieve the publishing profile behind the scenes for you

     

    Once all authentication information is in place, the wizard connects to your Windows Azure environment so you can start configuring it. There are a few things you can do:

    • Type a new Cloud Service name (or select one from the list if you already have one)
    • Type a new VM name (as before, you can select one from the list if there are any to select from)
    • Select a storage account in the same region as the VM (if you don’t have any, don’t worry – the wizard will create one for you)
    • Press the Settings… button (one of the most important steps). This button behaves differently depending on whether this is a new VM or an existing one, but in all cases it needs to be pressed to enter the information that is important for the process.

     

    When you type a new VM name and press the Settings… button, a new dialog comes up. If you have seen the Windows Azure Portal before, you should be familiar with most fields in this dialog.

     

    Please note: the wizard warns you if it thinks the target SQL Server instance might be a lower version than the source instance. When we know for sure that the final configuration will not be compatible, these are hard errors that do not let you continue; at other times we can only guess. In either case, the suggestion is not to ignore these warnings:

     

    When you select a VM that already exists, the dialog looks different.

    Please note two things here:

    • This is one of the first times SQL Server Cloud Adapter comes up on the screen. This is a new component that runs as a service inside a VM (you can run it on a physical machine too) and does all the magic of the VM configuration. We will spend a bit of time on its capabilities and restrictions a little later.
    • Since a single Cloud Service can contain multiple VMs sharing the same public IP, the Cloud Adapter port might be different for each VM. As the previous sentence suggests, this is a public port.

    Once you have chosen settings for a new VM or connected to an existing one, the Target Database section becomes active. Please note that the Database name field is editable, meaning you can change the database name on the target machine.

     

    Now you are just two easy steps away from having your database in the cloud. The first step is to confirm the selection you have made when following the wizard:

     

    The second step is to wait until the process completes. Depending on the database size and your connection speed the time may vary, but the wizard will make sure it gets done.

     Deploy Database to a Windows Azure VM Deployment Progress

    And finally you would see the confirmation screen that also contains a link to the detailed log.

     

    This log is important for most troubleshooting, as well as for digging into details.

    The Result

    The Deploy Database to a Windows Azure VM wizard has guided you through the deployment process. Now is the right time to look at the end result:

    1. You have a virtual machine running in Windows Azure
    2. The VM has a SQL Server instance configured
    3. Your database is deployed to that instance

    However, a few manual steps might be needed.

    1. You need to add logins and give those logins access to the database the wizard has deployed.
    2. If your application will be connecting to the VM from outside, make sure to configure SQL Server to allow that connectivity.
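    The first of these manual steps can be sketched in T-SQL. The login, user, password, and database names below are hypothetical; run this on the target VM’s SQL Server instance.

    ```sql
    -- Create a SQL login on the target instance (SQL authentication,
    -- since the VM is typically not domain-joined).
    CREATE LOGIN AppLogin WITH PASSWORD = 'Str0ng!Passw0rd';

    -- Map the login to a user in the deployed database and grant access.
    USE MyDatabase;
    CREATE USER AppUser FOR LOGIN AppLogin;
    ALTER ROLE db_datareader ADD MEMBER AppUser;
    ALTER ROLE db_datawriter ADD MEMBER AppUser;
    ```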

    Finally

    The Deploy Database to a Windows Azure VM wizard is there to simplify the steps needed to get a database into a Windows Azure VM environment. The whole wizard is built around scenarios where you need your database in the cloud as soon as possible and are willing to skip some studying or preparation steps as a tradeoff for deployment speed. The other assumption is that you don’t want to do the deployment manually and would prefer SQL Server to do some of this work for you.

    The wizard was made to be extremely safe for your data. There is no chance it could delete, overwrite, or even modify your data: all operations it performs are read-only with respect to the data.

    We really hope this new feature will help you save time during your deployments.

  • Boosting Transaction Performance in Windows Azure Virtual Machines with In-Memory OLTP

    With the release of SQL Server 2014 CTP2, you can now significantly boost the performance of your OLTP workloads in Windows Azure Virtual Machines. By creating a new VM with our preloaded image of SQL Server 2014 CTP2 on Windows Server 2012 R2, or installing SQL Server 2014 CTP2 on your VM, In-Memory OLTP functionalities are immediately available to you. This blog post provides a good guide on how to create a Windows Azure VM.

    However, since the transition to In-Memory OLTP is not as simple as flipping a switch, you must carefully evaluate your application scenario and see if it is the right solution for you.

    Recommended Scenarios

    For SQL Server 2014 CTP 2, we recommend the following scenarios for SQL Server In-Memory OLTP on a Windows Azure Virtual Machine:

    • Development and Test scenarios, e.g. familiarizing with In-Memory OLTP’s capabilities. The on-demand provisioning of Windows Azure VMs and its low cost make it easy to gain full access to In-Memory OLTP functionalities without a large capital investment. Once you are comfortable with its capabilities and understand its limitations, you can move to deploy In-Memory OLTP on your local server or keep using the VM if it suits your business needs.
    • Scenarios with relaxed data persistence requirements, e.g. web browser session state. SQL Server In-Memory OLTP provides non-durable tables that are perfect for transient data regardless of the transaction read/write mix. By completely eliminating I/O operations, non-durable tables can provide amazing performance boosts for your non-persistent workload. If some persistence is still required but strict durability is not, you can leverage the new Delayed Durability features we have added to CTP2. Delayed Durability commits transactions but does not immediately write the log records to disk, lightening log I/O pressure by allowing larger and less frequent log writes. For more details see the Books Online topic here.
    • Read-mostly and read-only scenarios, e.g. an online product catalogue. In-Memory OLTP provides extreme performance and parallelism in read-mostly and read-only scenarios due to new algorithms and data structures optimized for memory-resident data. Furthermore, native compilation for stored procedures can dramatically increase CPU efficiency and throughput.
    • Durable read-write scenarios not under log I/O pressure, e.g. workloads with heavy contention. In-Memory OLTP can also provide benefits for workloads with full durability requirements that are not under pressure from log I/O latency or throughput. Because In-Memory OLTP eliminates page contention by using an optimistic concurrency system, it can provide significant boosts to parallelism if your workload is suffering from contention problems. In addition, native compilation can improve the speed of a stored procedure with heavy business logic processing.
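    The non-durable table and Delayed Durability options mentioned above can be sketched as follows. This is a minimal illustration, assuming the database already has a MEMORY_OPTIMIZED_DATA filegroup; the database, table, and column names are hypothetical.

    ```sql
    -- Non-durable memory-optimized table: the schema persists,
    -- but the data does not survive a restart (DURABILITY = SCHEMA_ONLY).
    CREATE TABLE dbo.SessionState (
        SessionId INT NOT NULL
            PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 1000000),
        Payload VARBINARY(8000)
    ) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_ONLY);

    -- Allow delayed durability at the database level...
    ALTER DATABASE MyDatabase SET DELAYED_DURABILITY = ALLOWED;

    -- ...then opt in per transaction at commit time.
    BEGIN TRANSACTION;
        UPDATE dbo.SessionState WITH (SNAPSHOT)
        SET Payload = 0x01 WHERE SessionId = 42;
    COMMIT TRANSACTION WITH (DELAYED_DURABILITY = ON);
    ```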

    However, if your workload is suffering from long log I/O latency or if it is under pressure from log I/O throughput, and at the same time you require strict durability of your data, In-Memory OLTP on Windows Azure VM will not alleviate these problems.

    SQL Server 2014 CTP2 on a Windows Azure VM is not suitable for testing the performance of applications deployed on similarly configured on-premises machines.

    Selecting the Right Windows Azure VM Size

    The selection of VM sizes is important for any workload running in the IaaS space, whether you are provisioning a new VM using our preloaded CTP2 image or adding CTP2 onto an existing instance. Windows Azure provides a selection of virtual machine sizes for a variety of purposes, as listed in this article.

    Since In-Memory OLTP is designed to serve extremely high session concurrency, and memory-resident data by its nature requires sufficient memory space, we recommend the following Windows Azure VM sizes for adopting In-Memory OLTP:

    Compute Instance Name | Virtual Cores | RAM
    ----------------------|---------------|------
    Extra Large (A4)      | 8             | 14 GB
    A6                    | 4             | 28 GB
    A7                    | 8             | 56 GB

    The exact instance size you choose will depend on the scenario you wish to run and the amount of data you wish to make memory-resident in SQL Server In-Memory OLTP. We recommend provisioning 100% more memory than the estimated size of the data in memory, leaving sufficient space for the buffer pool used by disk-based workloads on the same machine. This article on SQL Server Books Online has more information on how to estimate the size of a table in memory, and we have published a blog post on hardware considerations for In-Memory OLTP, some of which apply to provisioning VMs as well.

    Virtual Machine Configuration

    To configure a Windows Azure Virtual Machine for the best In-Memory OLTP performance, we suggest you follow the best practices outlined in this whitepaper. A summary of key considerations, plus some attributes unique to In-Memory OLTP, is listed below:

    • Use a single storage account for all disks attached to the same VM.
    • Do not turn on disk host caching for the data and log drives.
      When creating new drives for a Windows Azure VM, the disk host caching option is turned off by default. We do not recommend you change this option.
    • Do not use the Temporary Disk (D: Drive) for log storage.
      This will sacrifice durability for your database, as the content on the Temporary Disk is transient and may be lost on a VM failure or reboot. Please use a persistent Windows Azure Storage disk for your log storage needs.
    • Do not use native operating system striping for storing the log of an In-Memory OLTP-enabled database.
      Logging for memory-optimized tables is latency-sensitive, and striping drives increases the latency of I/O operations. If you wish to use any form of striping for your log drive, test your configuration to ensure that the I/O latency is sufficiently low.
    • Compression is not supported for In-Memory OLTP tables. Compression will still work with regular tables in an In-Memory OLTP-enabled database.
  • Two SQL Server Jump Starts, Live on December 3rd and 4th

    If you haven’t yet, be sure to check out SQL Server 2014 CTP2. Essentially feature complete, this second public community technology preview enables you to try and test all of the capabilities of the full SQL Server 2014 release. To help get you trained on the new capabilities, Microsoft Virtual Academy (MVA) has two lively, demo-filled events planned. First, on December 3rd, Mission Critical Performance with SQL Server 2014 will show how SQL Server 2014 can help you improve performance by 10-30 times using your current hardware, and improve reliability at the same time. Then on December 4th, we’ll cover the benefits of moving your organization to the cloud, and how that can provide cost benefits while increasing scale and flexibility. Microsoft experts will demonstrate how you can make that move one step at a time, using SQL Server 2014 to create a hybrid environment.

    Mission Critical Performance with SQL Server 2014

    Date:  December 3, 2013
    Time: 9:00am – 5:00pm
    Where: Live, online virtual classroom
    Cost: Free!

    Register now

    Platform for Hybrid Cloud with SQL Server 2014

    Date:  December 4, 2013
    Time: 9:00am – 5:00pm
    Where: Live, online virtual classroom
    Cost: Free!

    Register now

     

  • The Modern Data Warehouse

    In recent surveys by TDWI Research, roughly half of respondents report that they will replace their primary data warehouse (DW) platform and/or analytic tools within three years. Ripping out and replacing a DW or analytics platform is expensive for IT budgets and intrusive for business users. This raises the question: What circumstances would lead so many people down such a dramatic path?

    It’s because many organizations need a more modern DW platform to address a number of new and future business and technology requirements. Most of the new requirements relate to big data and advanced analytics, so the data warehouse of the future must support these in multiple ways. Hence, a leading goal of the modern data warehouse is to enable more and bigger data management solutions and analytic applications, which in turn helps the organization automate more business processes, operate closer to real time, and through analytics learn valuable new facts about business operations, customers, products, and so on.

    For organizations that need a modern data warehouse that satisfies new and future requirements, TDWI offers a checklist of our top six recommendations. These can guide your selection of vendor products and your solution design.

    Users facing new and future requirements for big data, analytics, and real-time operation need to start planning today for the data warehouse of the future. To help them prepare, this TDWI Checklist Report drills into each of the six recommendations, listing and discussing many of the new vendor product types, functionality, and user best practices that will be common in the near future, plus the business case and technology strengths of each.

    This checklist was sponsored by Microsoft.

    To read the full report, sign up and download the report here

    *This post excerpted from the TDWI Checklist Report, November 2013
  • Evolving your SQL Server Data Warehouse to the Next Generation, High Performance Solution, the SQL Server Parallel Data Warehouse (PDW)

    Last week, we highlighted a whitepaper that focused on the performance benefits of SQL Server PDW and how it differs from traditional SQL Server. We’ve seen many SQL Server customers evolve to PDW as the next generation platform for their data warehouse infrastructure, for many of the reasons noted in that whitepaper. This week, we want to follow up by sharing our guide for migrating from a traditional data warehouse to PDW.

    This whitepaper is based on real-world best practices and showcases how you can leverage your existing SQL Server knowledge and skills to deploy SQL Server PDW as your next generation data warehouse solution. Download the paper to find out more.

  • How Does SQL Server Parallel Data Warehouse (PDW) Deliver the Performance that it Does?

    Last week, we introduced you to SQL Server PDW, the version of SQL Server built specifically for high performance data warehousing that delivers performance gains of up to 50x compared to traditional data warehouses. The next logical question we often get is “how is this possible?” Is it just SQL Server running on special hardware? And the answer is yes…but there is a lot more to it than that.

    PDW has a shared-nothing, Massively Parallel Processing (MPP) architecture that allows for better performance when loading and querying data simultaneously, as well as when performing the types of complex queries common among today’s analytically driven enterprises. The MPP architecture is also what enables PDW’s seamless linear scalability from terabytes to 6 petabytes of data. This is fundamentally different from the Symmetric Multi-Processing (SMP) architecture that traditional SQL Server is based on, and this MPP architecture is really at the heart of what’s different about PDW and how it’s able to deliver the groundbreaking performance that it does.

    To learn more about how PDW works and some of the performance gains we’ve seen in typical data warehouse scenarios, we encourage you to download this whitepaper today!

     Massively Parallel Processing Engine

  • SQL Server 2014 In-Memory OLTP: Memory Management for Memory-Optimized Tables

    Memory-optimized tables must fully reside in memory and can’t be paged out. Unlike disk-based tables, where insufficient memory merely slows down an application, the impact of running out of memory on memory-optimized tables can be severe, causing DML (i.e. delete, insert or update) operations to fail. While this adds a new dimension to managing memory, application failure due to resource errors is nothing new. For example, applications using disk-based tables can fail with resource errors such as running out of transaction log, TempDB, or storage. It is the responsibility of DBAs/administrators to make sure resources are provisioned and managed appropriately to avoid such failures. SQL Server provides a rich set of monitoring tools, including DMVs, PerfMon and XEvents, to help administrators identify problems early so that corrective action can be taken. Similarly, for memory-optimized tables, SQL Server provides a rich set of monitoring capabilities and configuration options so that you can manage your database/instance well and keep your application running smoothly. The remainder of this blog walks through each of the challenges and details how it can be addressed.

    How do I estimate the memory needed?

    This is the first question you need to consider when migrating existing tables to memory-optimized tables, or when planning a new application that uses memory-optimized tables. When migrating a disk-based table, you already know its current size (for example using sp_spaceused as described in http://technet.microsoft.com/en-us/library/ms188776.aspx), so finding the corresponding size for the memory-optimized table is a simple calculation. The key differences to be aware of are that memory-optimized tables cannot be compressed like disk-based tables with ROW and PAGE compression, so the memory-optimized table will likely be bigger. However, unlike indexes on disk-based tables, the indexes on memory-optimized tables are much smaller: for example, the index key is not stored with hash indexes, and all indexes are, by definition, covering indexes. Please refer to http://msdn.microsoft.com/en-us/library/dn247639(v=sql.120).aspx for details. A more challenging task is to estimate data growth. While you can make a reasonable guess, the best approach is to continuously monitor the table size and the memory consumed by memory-optimized tables in your database and instance. The same monitoring approach holds for new applications that are created with In-Memory OLTP in mind.
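    As a starting point for the calculation above, the current on-disk size of the table being migrated can be read with sp_spaceused, as referenced in the linked article. The table name below is hypothetical.

    ```sql
    -- Current size of the disk-based table being migrated.
    EXEC sp_spaceused 'Sales.SalesOrderDetail';
    -- The data and index_size columns give a baseline. Remember that
    -- ROW/PAGE compression does not apply to memory-optimized tables,
    -- so budget for the uncompressed size, plus headroom for growth.
    ```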

    How does SQL Server manage memory for memory-optimized tables?

    The In-Memory OLTP engine is integrated into SQL Server. Memory allocations for memory-optimized tables are managed by the SQL Server memory manager, and the allocated memory is tracked using familiar constructs and tools such as memory clerks and DMVs. The following query shows the XTP memory clerks. The first row shows memory allocated by system threads; the second row, with name DB_ID_5, represents the consumers in database objects; and the third row, with memory node id 64, represents memory allocated to the DAC (Dedicated Admin Connection).

    -- Show the memory allocated to in-memory OLTP objects
    select type, name, memory_node_id, pages_kb/1024 as pages_MB
    from sys.dm_os_memory_clerks where type like '%xtp%'

    type                 name       memory_node_id pages_MB
    -------------------- ---------- -------------- --------------------
    MEMORYCLERK_XTP      Default    0              18
    MEMORYCLERK_XTP      DB_ID_5    0              1358
    MEMORYCLERK_XTP      Default    64             0

    Also, there are new DMVs that can be used to monitor the memory consumed by the in-memory OLTP engine and memory-optimized tables. Please refer to http://msdn.microsoft.com/en-us/library/dn133203(v=sql.120).aspx for details.
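    For example, one of the new DMVs can break the memory consumption down per memory-optimized table. This is a sketch run in the context of the database in question; the exact column set is described in the linked documentation.

    ```sql
    -- Per-table memory consumption for memory-optimized tables
    -- in the current database.
    SELECT OBJECT_NAME(object_id) AS table_name,
           memory_allocated_for_table_kb,
           memory_used_by_table_kb
    FROM sys.dm_db_xtp_table_memory_stats;
    ```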

    Like any other memory consumer, the In-Memory OLTP engine responds to memory pressure, but only to a limited degree. For example, the memory consumed by data and indexes can’t be released even under memory pressure. This differs from disk-based tables, where external memory pressure may cause the buffer pool to shrink, which simply means there will be fewer data/index pages in memory. For this reason, it is all the more important to provision memory for memory-optimized tables appropriately; otherwise, the In-Memory OLTP engine can starve other memory consumers, including the memory SQL Server needs for its own operations, which can ultimately lead to a slow or unresponsive application. To address this, SQL Server provides a configuration option to limit the memory consumed by memory-optimized tables.

    How can I limit memory consumed by memory-optimized tables?

    Starting with SQL Server 2014, you can bind a database to a resource pool. This binding is only relevant when the database has one or more memory-optimized tables. The memory available in the resource pool controls the total memory available to memory-optimized tables in the database.

    For example, create a resource pool, mem_optpool, as follows:

    CREATE RESOURCE POOL mem_optpool WITH (MAX_MEMORY_PERCENT = 40);

    Now map the database, mydatabase, to this resource pool by executing the following command. With this command, you are specifying that the total memory taken by memory-optimized tables and indexes cannot exceed the limit of the resource pool. In this case, the other 60% of memory is available to other consumers.

    EXEC sp_xtp_bind_db_resource_pool 'mydatabase', 'mem_optpool'

    When configuring memory for memory-optimized tables, capacity planning should be based on MIN_MEMORY_PERCENT, not MAX_MEMORY_PERCENT. This provides more predictable memory availability for memory-optimized tables, because pools that have min_memory_percent set can cause memory-pressure notifications against other pools to ensure the minimum percentage is honored. To ensure that memory is available for the In-Memory OLTP database and to help avoid OOM (out of memory) conditions, set MAX_MEMORY_PERCENT and MIN_MEMORY_PERCENT to the same value. SQL Server’s target memory is dynamic relative to the OS, so setting a minimum memory is recommended only if the server is not dedicated. For details, please refer to http://msdn.microsoft.com/en-us/library/dn465873(v=sql.120).aspx.
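    Putting that guidance together, a sketch of a pool with matching minimum and maximum follows. The pool and database names are hypothetical.

    ```sql
    -- Equal MIN and MAX per the capacity-planning guidance above.
    CREATE RESOURCE POOL mem_optpool2
    WITH (MIN_MEMORY_PERCENT = 40, MAX_MEMORY_PERCENT = 40);
    ALTER RESOURCE GOVERNOR RECONFIGURE;

    -- Bind the database with memory-optimized tables to the pool.
    EXEC sp_xtp_bind_db_resource_pool 'mydatabase', 'mem_optpool2';

    -- Take the database offline and back online so existing
    -- allocations move under the pool's control.
    ALTER DATABASE mydatabase SET OFFLINE;
    ALTER DATABASE mydatabase SET ONLINE;
    ```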

    How does SQL Server reclaim memory taken by deleted/updated rows

    The rows of memory-optimized tables are stored in memory and linked through hash and nonclustered indexes, as described at http://msdn.microsoft.com/en-us/library/dn133190(v=sql.120).aspx. Concurrent access to memory-optimized tables uses optimistic concurrency control based on row versions. Over time, existing rows may get updated (an update generates a new row version) or deleted, but these row versions can’t be removed immediately, as concurrent transactions may still need them. These older row versions are garbage collected (GC’d) asynchronously once it is determined, based on the active transactions, that they are no longer needed. A GC system thread shares the row-version cleanup (i.e. GC) work with user transaction activity to ensure that SQL Server can keep up with the GC. When you configure the memory for your workload, you must account for the additional memory needed for stale row versions. You can roughly estimate it using the following steps:

    1. Find the average number of rows updated/deleted per second by querying the DMV dm_db_xtp_object_stats at the beginning and end of a desired duration for your peak workload, and then computing the per-second average over that duration.
    2. Estimate the number (NR) of row versions = (duration of the longest transaction accessing memory-optimized tables, in seconds) * (average number of rows updated/deleted per second)
    3. Assuming that delete/update operations are uniformly distributed across all memory-optimized tables, find the average row length RLen (including the row header and index pointers). You can now estimate the memory for row versions as (NR * RLen)
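    The steps above can be worked through with a quick calculation. The workload numbers here are purely hypothetical, chosen for illustration.

    ```sql
    -- Hypothetical inputs for the NR * RLen estimate.
    DECLARE @RowsPerSec BIGINT = 10000;  -- avg rows updated/deleted per second (step 1)
    DECLARE @LongestTxnSec BIGINT = 5;   -- longest transaction duration, seconds (step 2)
    DECLARE @RLen BIGINT = 200;          -- avg row length in bytes, incl. header/pointers (step 3)

    -- NR = @RowsPerSec * @LongestTxnSec; memory = NR * RLen, shown in MB.
    SELECT (@RowsPerSec * @LongestTxnSec * @RLen) / 1024 / 1024 AS row_version_MB;
    -- 10,000 * 5 * 200 = 10,000,000 bytes, roughly 9.5 MB of extra memory.
    ```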

    You can use DMVs and Perfmon counters to monitor the progress of Garbage collection. Please refer to http://msdn.microsoft.com/en-us/library/dn133203(v=sql.120).aspx.

  • Microsoft SQL Server Data Tools Update

    With the release of SQL Server 2014 Community Technology Preview 2 (CTP2) and the commercial release of Visual Studio 2013, we are excited to announce new preview releases of SQL Server Data Tools to support both database and business intelligence projects.

    SQL Server Data Tools - Business Intelligence for SQL Server 2014 Preview

    To support new business intelligence projects, we have recently released a Community Technology Preview (CTP) of the SQL Server Data Tools – Business Intelligence (SSDT-BI) tools that work with SQL Server 2014 CTP2. This CTP release can be downloaded here.

    SQL Server Data Tools - Business Intelligence (SSDT-BI) includes SQL Server Analysis Services projects, SQL Server Reporting Services projects, and SQL Server Integration Services projects hosted in Visual Studio. The SSDT-BI tools can be downloaded and installed for free from the web.  SSDT-BI tools:

    • Integrate into existing Visual Studio installations for those developers that already have Visual Studio 2012 (Professional Edition or higher) installed.
    • Install as standalone tools, with the Visual Studio integrated IDE shell also being installed to host the various BI projects they support.

    Once installed, the SSDT-BI tools do not automatically notify you of available updates within the Visual Studio environment, so you must visit the download site to obtain updates manually. You can also always check here to make sure you have the latest release installed.

    SQL Server 2014 Business Intelligence Support and Visual Studio 2013 Support

    The existing preview version of SSDT-BI for SQL Server 2014 is based on Visual Studio 2012.  The commercial release of SSDT-BI is expected to be based on the newer Visual Studio 2013 in conjunction with the commercial release of the SQL Server 2014.  At that time, the existing SQL Server 2014 CTP 2 SSDT-BI that is based on Visual Studio 2012 will be retired and the new version of SSDT-BI that is based on Visual Studio 2013 will take its place.    

    SSDT-BI for SQL Server 2014 will support various SQL Server versions as follows:

    • SSAS projects can target SQL Server 2014 or lower
    • SSRS projects can target SQL Server 2014 or lower
    • SSIS projects can target only SQL Server 2014

    To target SSIS projects for SQL Server 2012, developers will have to continue to use the existing SSDT-BI version for SQL Server 2012, which is based on Visual Studio 2012. This version of SSDT-BI for SQL Server 2012 that is currently available will continue to be available after the SQL Server 2014 release.  

    SSDT-BI for Visual Studio 2012 supports versions of SQL Server as follows:

    • SSAS project can target SQL 2012 or lower
    • SSRS project can target SQL 2012 or lower
    • SSIS project can target only SQL 2012

    SQL Server Data Tools Database Projects Now Integrated within Visual Studio

    The database developer toolset, previously available as an add-in for Visual Studio 2010 and updated for Visual Studio 2012, is now an integrated feature of both Visual Studio 2012 and 2013. Along with being integrated in the full Visual Studio SKUs, it can also be installed for free as standalone tools by database developers without Visual Studio. These tools provide end-to-end support for building and managing Windows Azure SQL Databases and SQL Server databases, with SQL Server version-specific validation. The tools include live connected tools and designers, and the ability to create database projects that integrate with Visual Studio source control (including Team Foundation Server) for full schema development, debugging, database migrations and overall database lifecycle management. Over the past year, new features have been added including:

    • Database Unit Testing
    • Data Compare
    • Declarative Model Extensibility (Deployment and Build contributors)

    See below for a summarized map to acquire and update the core (non-BI) database tooling.

    SQL Server Database Tooling

    Visual Studio 2010

    Standalone, add-in (free)
    SQL Server Data Tools for Visual Studio 2010

    Updates
    In-shell notification for standalone, add-in.

    Visual Studio 2012

    Integrated
    Visual Studio Ultimate 2012
    Visual Studio Premium 2012
    Visual Studio Professional 2012

    Standalone, add-in (free)
    SQL Server Data Tools for Visual Studio 2012

    Updates
    In-shell notification for standalone, add-in.

    Visual Studio 2013

    Integrated
    Visual Studio Ultimate 2013
    Visual Studio Premium 2013
    Visual Studio Professional 2013

    http://www.microsoft.com/visualstudio/eng/downloads#d-2013-editions

    Standalone (free)
    Visual Studio Express 2013 for Windows
    Visual Studio Express 2013 for Web

    http://www.microsoft.com/visualstudio/eng/downloads

    Windows Azure SDK for Visual Studio 2013

    http://www.windowsazure.com/en-us/downloads/

    Updates

    In-shell notification & delivery

    SQL Server Data Tools for SQL Server 2014

    With the release of SQL Server 2014 CTP2, we have added support for SQL Server 2014 via a release of SQL Server Data Tools for SQL Server 2014 CTP2.     

    Summary

    We are hard at work finishing up our complete Visual Studio tooling for SQL Server 2014. Both the tools integrated with Visual Studio 2012 and 2013 and the BI tools have pre-releases available for SQL Server 2014 already.  At commercial release of SQL Server 2014, we plan to also commercially release updates to the database developer and BI tools for the commercial release of SQL Server 2014. 

  • SQL Server 2014 In-Memory OLTP: Nonclustered Indexes for Memory-Optimized Tables

    SQL Server 2014 CTP1 introduced hash indexes for memory-optimized tables. Hash indexes are very efficient for point lookups, when you know exactly the value you are looking for. However, they do not perform well if you need a range of values, for example a date range, or if you need to retrieve the rows in a particular order.

    Memory-optimized nonclustered indexes are a new index type for memory-optimized tables in CTP2. Nonclustered indexes support retrieving ranges of values, and also support retrieving the table rows in the order that was specified when the index was created. They can be used to optimize the performance of queries on memory-optimized tables that use inequality predicates like ‘<’ and ‘>’, and queries using an ORDER BY clause.

    Nonclustered indexes also support point lookups, but hash indexes still offer far better performance. You should continue to use hash indexes to optimize the performance of queries with equality predicates.

    Hash Index Limitations

    Consider, for example, the following (simplified) CREATE TABLE statement. The table SalesOrderHeader_inmem has a hash index on the primary key column ‘SalesOrderID’, as well as a hash index on the column ‘OrderDate’. Note that the BUCKET_COUNT should typically be set between one and two times the number of unique index key values.

    CREATE TABLE [Sales].[SalesOrderHeader_inmem](
    [SalesOrderID] uniqueidentifier NOT NULL
    PRIMARY KEY
    NONCLUSTERED HASH WITH (BUCKET_COUNT=1000000)
    DEFAULT (NEWID()),
    [RevisionNumber] [tinyint] NOT NULL CONSTRAINT 
    [IMDF_SalesOrderHeader_RevisionNumber] DEFAULT ((0)),
    [OrderDate] [datetime2] NOT NULL
    INDEX ix_OrderDate HASH WITH (BUCKET_COUNT=1000000)
    ) WITH (MEMORY_OPTIMIZED=ON)

    If you run a query of the form:

    SELECT * FROM Sales.SalesOrderHeader_inmem WHERE SalesOrderID = @ID

    SQL Server will use the hash index on the column SalesOrderID to quickly identify the row corresponding to the parameter @ID. Witness the Index Seek operation in the query plan:

     

    Hash indexes are very efficient for point lookup operations, because they require only a simple lookup in a hash table rather than traversing an index tree structure, as is required for traditional (non)clustered indexes. If you are looking for the rows corresponding to a particular index key value, using a hash index is the way to go.

    Now, hash indexes do have some limitations. Because of the nature of hash tables, rows appear in the index in random order. This means it is not possible to retrieve ranges of values, or to scan rows in any particular order. Therefore, hash indexes do not support Index Seek operations on inequality predicates, and do not support ordered scans. The former results in a full index scan, and the latter results in a scan followed by an (expensive) sort operator. All this can result in degraded performance when such queries run against hash indexes.

    Consider the following two example queries:

    Query with an inequality predicate

    SELECT * FROM Sales.SalesOrderHeader_inmem 
    WHERE OrderDate > @Date

    The plan for this query does not use the index on OrderDate; it simply includes a full index scan of the primary index. This means that SQL Server will process all the rows in the table, and only afterwards filter out the rows that do not satisfy OrderDate > @Date.

     

    Query with an ORDER BY

    SELECT * FROM Sales.SalesOrderHeader_inmem 
    ORDER BY OrderDate

    The plan for this query includes a sort operator, which is very costly: after scanning the rows, all rows will need to be ordered to obtain the desired sort-order.

     

    Nonclustered indexes

    The new memory-optimized nonclustered indexes support both Index Seek operations using inequality predicates, and ordered scans. Consider the following amended CREATE TABLE statement, which now includes a nonclustered index on the column OrderDate.


    CREATE TABLE [Sales].[SalesOrderHeader_inmem](
    [SalesOrderID] uniqueidentifier NOT NULL
    PRIMARY KEY
    NONCLUSTERED HASH WITH (BUCKET_COUNT=1000000)
    DEFAULT (NEWID()),
    [RevisionNumber] [tinyint] NOT NULL CONSTRAINT 
    [IMDF_SalesOrderHeader_RevisionNumber] DEFAULT ((0)),
    [OrderDate] [datetime2] NOT NULL
    INDEX ix_OrderDate NONCLUSTERED
    ) WITH (MEMORY_OPTIMIZED=ON)

    Note that the keyword NONCLUSTERED is optional in this example. Also note that the syntax for memory-optimized nonclustered indexes is similar to the traditional disk-based nonclustered indexes. The only difference is that with memory-optimized indexes you need to specify the index with the CREATE TABLE statement, while with traditional indexes, you can create the index after creating the table.

    Consider now the following two queries:

    Query with an inequality predicate

    SELECT * FROM Sales.SalesOrderHeader_inmem 
    WHERE OrderDate > @Date

    The plan for this query now uses the index on OrderDate. And when inspecting the properties of the Index Seek operator, you will see the inequality predicate OrderDate > @Date. This means that SQL Server will only need to process the rows with OrderDate > @Date.

    Query with an ORDER BY

    SELECT * FROM Sales.SalesOrderHeader_inmem 
    ORDER BY OrderDate

    The plan for this query does not include a sort operator. SQL Server will scan the rows in the index in order; in this case the sort-order of the index is the same as the sort-order required by the query, thus no additional sort operator is needed.

     

    Note that, in contrast to disk-based nonclustered indexes, memory-optimized nonclustered indexes are uni-directional. This means that they support index scans only in the order that was specified when the index was created. If the ORDER BY clause in the above example would require OrderDate in DESC order, the index ix_OrderDate could not be used.
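    Because of this uni-directionality, one option worth considering when descending scans matter for your workload is to declare the index key in DESC order at table creation time. The following is only a sketch; the table and index names are illustrative, and it uses the table-level inline index syntax:

    CREATE TABLE [Sales].[SalesOrderHeader_desc](
    [SalesOrderID] uniqueidentifier NOT NULL
        PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT=1000000)
        DEFAULT (NEWID()),
    [OrderDate] [datetime2] NOT NULL,
    -- Key declared DESC, so ordered scans run newest-first.
    INDEX ix_OrderDate_desc NONCLUSTERED ([OrderDate] DESC)
    ) WITH (MEMORY_OPTIMIZED=ON)

    With such an index, a query using ORDER BY OrderDate DESC can use an ordered scan, while ORDER BY OrderDate (ascending) cannot, mirroring the uni-directional behavior described above.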

    Limitations on memory-optimized indexes

    Memory-optimized indexes do have a few limitations in SQL Server 2014 that we hope to address in future versions. You do need to consider these limitations when deciding on an indexing strategy for your memory-optimized tables.

    • At most 8 indexes – you cannot specify more than 8 indexes on a single memory-optimized table.
    • BIN2 collations – when using n(var)char columns in an index key, the columns must have a _BIN2 collation. Using BIN2 allows very fast lookup, as this can be based simply on binary comparison. However, you need to consider the implications of using a BIN2 collation, such as case and accent sensitivity. For more details see the Books Online topic on Collations and Code Pages.
    • NOT NULL columns in the index key – memory-optimized indexes do not support nullable columns; all columns in the index key must be declared as NOT NULL.
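    The last two limitations can be sketched in one table definition. The table and column names below are illustrative, and the collation shown is simply one common BIN2 choice:

    CREATE TABLE dbo.Product_inmem(
    [ProductID] int NOT NULL
        PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT=1000000),
    -- String index key: must be NOT NULL and use a _BIN2 collation.
    [Name] nvarchar(50) COLLATE Latin1_General_100_BIN2 NOT NULL
        INDEX ix_Name NONCLUSTERED
    ) WITH (MEMORY_OPTIMIZED=ON)

    Omitting the _BIN2 collation, or declaring the index key column as nullable, would cause the CREATE TABLE statement to fail.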

    Guidelines for choosing indexes

    A few simple guidelines to help choose the type of index you need for your application:

    • If you need to perform only point lookups, meaning you need to retrieve only the rows corresponding to a single index key value, use a hash index.
    • If you need to retrieve ranges of rows, or need to retrieve the rows in a particular sort-order, use a nonclustered index.
    • If you need to do both, particularly if point lookups are frequent, you can consider creating two indexes: it is possible to create both a hash and a nonclustered index with the same index key.
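    For the third guideline, here is a sketch of a table that carries both a hash index and a nonclustered index on the same key. The names and bucket counts are illustrative, and the example uses the table-level inline index syntax:

    CREATE TABLE dbo.Orders_inmem(
    [OrderID] int NOT NULL
        PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT=2000000),
    [CustomerID] int NOT NULL,
    -- Hash index serves point lookups on CustomerID...
    INDEX ix_CustomerID_hash NONCLUSTERED HASH ([CustomerID]) WITH (BUCKET_COUNT=1000000),
    -- ...while the nonclustered index serves range queries and ordered scans.
    INDEX ix_CustomerID_range NONCLUSTERED ([CustomerID])
    ) WITH (MEMORY_OPTIMIZED=ON)

    The optimizer can then pick the hash index for equality predicates on CustomerID and the nonclustered index for inequality predicates and ORDER BY.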

    Hash indexes require an estimate of the number of unique index key values, to determine an appropriate value for the BUCKET_COUNT – typically between 1X and 2X. Because nonclustered indexes do not require setting the BUCKET_COUNT, it can be tempting to simply use a nonclustered index instead of a hash index. However, this could lead to sub-optimal performance for point lookup operations. In many scenarios where point lookup performance is critical and there is no good estimate for the number of unique index key values, it is better to over-provision the BUCKET_COUNT: i.e., pick a very high bucket count that you know is going to be larger than the number of index keys. Note that over-provisioning does not affect the performance of point lookups, but under-provisioning does. However, over-provisioning does increase memory-consumption and it slows down full index scans.

    For more details on usage of memory-optimized indexes see the Books Online article on Guidelines for Using Indexes on Memory-Optimized Tables.

  • New In-Memory OLTP Sample for SQL Server 2014 CTP2

    To help you get started with the new In-Memory OLTP feature, we created a sample around sales order processing based on the AdventureWorks sample database.

    The sample, along with installation and usage instructions, are available on Codeplex:

    https://msftdbprodsamples.codeplex.com/releases/view/114491

    We encourage you to download and install the sample to become familiar with the new memory-optimized tables and natively compiled stored procedures, introduced by the In-Memory OLTP feature in SQL Server 2014 CTP2.

    We also included a number of demo stored procedures in the sample, which can be used to measure the performance of the sample on your system, and to contrast the performance of the new memory-optimized tables with traditional disk-based tables. Instructions for running the demo stored procedures, and for testing at scale, are in the documentation included with the sample.

    The sample is a work in progress, and we appreciate any and all feedback. You can post feedback and questions on the SQL Server Samples Forum.

    We will be updating the sample regularly based on your feedback, as we get closer to the RTM release.

  • IO Resource Governance in SQL Server 2014

    Resource Governor was introduced in SQL Server 2008 to achieve predictable performance in a multi-tenant environment, to support mixed workload patterns, to provide mechanisms and tools to deal with runaway queries, and to enforce resource limits for CPU and memory. It enables customers to implement database consolidation or to configure their own database as a service. Since then, we’ve been incrementally enhancing this feature in major releases to address the top customer requests in this area. In SQL Server 2014, we are excited to add support for IO resource governance.

    What is new?

    • Resource Pools now support configuration of minimum and maximum IOPS per volume in addition to CPU/Memory settings enabling more comprehensive resource isolation controls. See syntax details for Create and Alter resource pool with IO support.
    • Ability to configure the maximum outstanding IO per volume (at instance scope). This knob is meant to calibrate the resource governor feature for your storage subsystem.
    • As a note of clarification, by volume we refer to the disk volume as identified by Windows file system APIs.
    • We’ve added new columns in the sys.dm_resource_governor_resource_pools and sys.dm_resource_governor_configuration dynamic management views to surface the IO usage and configuration. In addition, we’ve also added a new DMV called sys.dm_resource_governor_resource_pool_volumes that captures IO usage stats across the different volumes used by the instance.
    • Two new XEvents (file_write_enqueued, file_read_enqueued) were added to monitor IO requests in the IO Resource Manager queues. These XEvents follow the conventions of the existing IO-related events, such as IO issue and IO completion.
    • Last but not least, we’ve added relevant performance counters such as Disk Read IO/sec, Disk Read Bytes/sec, Avg Disk Read IO (ms), Disk Write IO/sec, Disk Write Bytes/sec, Avg Disk Write IO (ms), Disk Read IO Throttled/sec, Disk Write IO Throttled/sec to the SQLServer:Resource Pool Stats.
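    For example, once IO settings are configured, the new per-volume DMV can be joined to the pools DMV to inspect usage. This is only a sketch: it assumes the volumes DMV exposes a pool_id column for the join, and it selects v.* rather than guessing at the exact usage column names, which may vary between CTP builds:

    -- Inspect per-volume IO statistics for each resource pool.
    SELECT p.name AS pool_name, v.*
    FROM sys.dm_resource_governor_resource_pool_volumes AS v
    JOIN sys.dm_resource_governor_resource_pools AS p
        ON p.pool_id = v.pool_id;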

    How to use it?

    Let’s take the following scenario as an example of how IO Resource Governance can be used to control the resource usage in a SQL Server instance:

    Let’s suppose we are a database hosting provider, or are running a private cloud for database consolidation, and we need to host multiple databases from different tenants (or customers) within a single SQL Server instance to achieve better density and lower COGS. If one of the tenants runs a very IO-intensive workload, it can saturate the IO subsystem, causing performance problems for the concurrent workloads that also need to perform IO.

    The first step would be to create a Resource Pool for each tenant/database and a classifier function that will map the sessions from each tenant to the corresponding Resource Pool. For example, sessions for Customer 1 can be mapped to Resource Pool 1 and sessions for Customer 2 to Resource Pool 2.

    If you want to use IO Resource Governance, it is important to set the min or max IOPS setting for every Resource Pool so that the IO requests are redirected to the governance subsystem and minimum reservations can be honored. In the example below, we set the MAX_IOPS_PER_VOLUME to the maximum value for every Pool:

    -- Create 2 resource pools & 2 workload groups.
    CREATE RESOURCE POOL Customer1Pool;
    CREATE RESOURCE POOL Customer2Pool;
    GO

    CREATE WORKLOAD GROUP Customer1Group USING Customer1Pool;
    CREATE WORKLOAD GROUP Customer2Group USING Customer2Pool;
    GO

    -- Create classifier function
    CREATE FUNCTION fnUserClassifier()
    RETURNS SYSNAME
    WITH SCHEMABINDING
    AS
    BEGIN
        IF ORIGINAL_DB_NAME() = 'Customer1DB'
        BEGIN
            RETURN 'Customer1Group';
        END

        IF ORIGINAL_DB_NAME() = 'Customer2DB'
        BEGIN
            RETURN 'Customer2Group';
        END

        RETURN 'default';
    END;
    GO

    -- Set the classifier function and enable RG.
    ALTER RESOURCE GOVERNOR WITH (CLASSIFIER_FUNCTION = dbo.fnUserClassifier);
    ALTER RESOURCE GOVERNOR RECONFIGURE;
    GO

    -- Set default values for the resource pools so that IO RG is enabled.
    ALTER RESOURCE POOL Customer1Pool WITH (MIN_IOPS_PER_VOLUME=0, MAX_IOPS_PER_VOLUME=2147483647);
    ALTER RESOURCE POOL Customer2Pool WITH (MIN_IOPS_PER_VOLUME=0, MAX_IOPS_PER_VOLUME=2147483647);
    ALTER RESOURCE GOVERNOR RECONFIGURE;
    GO

    Classifying each workload to a different Resource Pool allows us to configure the resource limits we want to provide for each tenant and also monitor the resource usage generated by their workload. The graph below (Performance Monitor) shows that the workload from Customer 1 is issuing too many IO requests causing a performance drop for Customer 2:

     

    In order to protect Customer 2 and guarantee consistent performance regardless of the activity from other tenants, we can set the MIN_IOPS_PER_VOLUME setting for the corresponding Resource Pool. From the graph above, it seems that the system can handle around 1300 IOPS, so we decide to reserve 650 IOPS for Customer 2:

    ALTER RESOURCE POOL Customer2Pool WITH (MIN_IOPS_PER_VOLUME=650);
    ALTER RESOURCE GOVERNOR RECONFIGURE;
    GO

    With this configuration, SQL Server will try to throttle the workloads that are running in other Resource Pools, in order to satisfy the 650 IOPS reservation for Customer 2. In the graph below, we can see that the IOPS of the system are now fairly distributed among the tenants and that the performance for Customer 2 is back to normal despite its noisy neighbor:

     

    The MIN_IOPS_PER_VOLUME setting will make a reservation for the Resource Pool, but it won’t set any limit on the maximum IOPS it can use. This means that tenants will still see performance variation depending on how active the other tenants on the instance are. To avoid this problem and guarantee predictable performance, we can set the MAX_IOPS_PER_VOLUME setting for each of the tenants. This sets a hard limit on the maximum IOPS the tenant’s workload can use, guaranteeing predictable performance for the tenant while also protecting the rest of the tenants on the instance:

    ALTER RESOURCE POOL Customer2Pool WITH (MAX_IOPS_PER_VOLUME=750);
    ALTER RESOURCE GOVERNOR RECONFIGURE;
    GO

    By configuring the IO settings on the Resource Pools we can control the resources we want to provide for each tenant. This allows us to guarantee predictable performance regardless of the activity from other tenants or even provide differentiation in SLA for the Database Service based on the amount of the resources customers sign up to reserve.

    Another scenario that many of you might find applicable is isolating your OLTP workload from any maintenance operations that need to run in the database. Rebuilding an index, for example, is a common operation that can trigger a large number of IO requests, as it needs to scan the whole index or table. By using IO Resource Governance we can limit the number of IO operations these tasks can perform and guarantee predictable performance for concurrent OLTP workload.

    In this case, we need a Resource Pool dedicated for maintenance operations and a classifier function that will map maintenance sessions to the corresponding Resource Pool. Running these operation as a different user might be a good way to distinguish between regular and maintenance sessions. By configuring the MAX_IOPS_PER_VOLUME setting on the “maintenance” Resource Pool, we can limit the number of IO operations these tasks can perform and protect concurrent OLTP workload from being impacted.
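    A sketch of that configuration follows. The pool and group names, the 200 IOPS cap, and the maintenance login name are all illustrative choices, not prescribed values:

    -- Dedicated pool for maintenance operations, capped per volume.
    CREATE RESOURCE POOL MaintenancePool WITH (MAX_IOPS_PER_VOLUME = 200);
    CREATE WORKLOAD GROUP MaintenanceGroup USING MaintenancePool;
    GO

    -- In the classifier function, route sessions from the dedicated
    -- maintenance login to the maintenance group, e.g.:
    --   IF SUSER_NAME() = 'MaintenanceLogin' ... RETURN 'MaintenanceGroup'
    ALTER RESOURCE GOVERNOR RECONFIGURE;
    GO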

    I hope the example scenarios above demonstrate the core value of this feature.

    Call to Action

    Please try this feature right away by downloading SQL Server 2014 CTP2. Even easier, test it on the SQL Server 2014 CTP2 images that are now available in Windows Azure. We look forward to hearing your feedback.

  • Get to Know the SQL Server that’s Purpose Built for High Performance Data Warehousing and Big Data Analytics – the SQL Server Parallel Data Warehouse Appliance

    If your business relies on data, you know that it is a constant challenge to store, manage, and analyze it effectively as your data continues to grow. It’s also expensive to keep enough data on “hot” storage where it is readily available for analysis. Even when you have the data you need on hot storage, it can take hours or even days to run analysis and reports on today’s symmetric multi-processing (SMP) systems. To add to the challenge, businesses today are struggling to figure out how to incorporate the value of non-relational Hadoop data into their analysis.

    As a result, business analysts are held back from making faster and more accurate data-driven business decisions that are needed to compete in today’s marketplace. This is the modern data challenge and one that some of our SQL Server customers are facing today. But help is here for those of you who know and love SQL Server (or even for those of you who don't).

    Get to know the SQL Server 2012 Parallel Data Warehouse (PDW), Microsoft’s next generation platform for data warehousing and Big Data integration. With PDW’s massively parallel processing (MPP) design, queries commonly complete 50 times faster than traditional data warehouses built on symmetric multi-processing (SMP) database management systems. 50 times faster means that queries complete in minutes instead of hours, or seconds instead of minutes. With this breakthrough performance, your business analysts can generate more extensive results faster, and can easily perform ad-hoc queries or drill down into the details. As a result, your business can make better decisions, faster.

    To learn more, go to http://www.microsoft.com/en-us/sqlserver/solutions-technologies/data-warehousing/upgrade-to-pdw.aspx.

  • In-Memory OLTP: High Availability for Databases with Memory-Optimized Tables

    Starting with SQL Server 2012, the AlwaysOn Availability Groups and AlwaysOn Failover Cluster Instances (FCI) suite of features has enabled Tier-1 customers with mission-critical applications to achieve their availability goals with SQL Server through an easy-to-deploy and easy-to-manage solution. SQL Server 2014 builds on this success and enhances AlwaysOn Availability Groups with up to 8 replicas, the ability to access a secondary replica for offloading reporting workloads even when disconnected from the primary, and hybrid scenarios with Windows Azure.

    In-Memory OLTP also targets mission-critical applications, delivering up to 30x better performance, and is well integrated with the High Availability offerings SQL Server provides. The sections below review each of the High Availability offerings in the context of the In-Memory OLTP solution. SQL Server 2014 offers customers four high-availability choices, as follows:

    • AlwaysOn Failover Cluster Instances (FCI): An FCI is a single instance of SQL Server that is installed across Windows Server Failover Clustering (WSFC) nodes and, possibly, across multiple subnets. On the network, an FCI appears to be an instance of SQL Server running on a single computer, but the FCI provides failover from one WSFC node to another if the current node becomes unavailable.

    In-Memory OLTP is fully integrated with FCI. A database with memory-optimized tables can be configured and managed much like a database with no memory-optimized objects. However, memory-optimized tables in the database will likely add to the recovery time, because these tables need to be loaded into memory before the database can be accessed. The In-Memory OLTP engine loads the data/delta files in parallel to minimize the time taken. The time to load the memory-optimized tables depends upon the size of the memory-optimized tables, the number of data/delta files, the IOPS available to read the data/delta files, and the number of CPUs/cores available. It is a best practice to configure the SQL Server instances with the same amount of memory because, unlike for disk-based tables, insufficient memory will cause database recovery to fail on the new instance. Please refer to How to manage your Memory for In-Memory OLTP for managing memory for a database with memory-optimized tables.

    • AlwaysOn Availability Groups: SQL Server 2014 allows up to 8 secondary replicas, including 2 synchronous secondary replicas with an auto-failover option. You can offload reporting workloads and database/log backups to one or more secondary replicas. This allows all resources on the primary replica to be used solely by the production workload, leading to increased throughput. Each replica has its own copy of the databases in the availability group.

    In-Memory OLTP is fully integrated with AlwaysOn Availability Groups. You can set up a database with memory-optimized tables in an availability group by following exactly the same steps that you would for a database without memory-optimized tables. Key points:

      • Impact on failover time: The memory-optimized tables are maintained in memory and kept up to date by the redo thread as it processes transaction log records. This ensures that failover to a secondary replica will not incur the additional overhead of loading memory-optimized tables into memory. Hence, unlike with FCI, there is no impact on failover time.
      • Readable Secondary: Querying memory-optimized tables, both with natively compiled stored procedures and with interpreted Transact-SQL (interop), is supported on the secondary replica. Unlike disk-based tables, memory-optimized tables are accessed using optimistic concurrency, so there is no need for the mapping of isolation levels to snapshot isolation that is done when accessing disk-based tables on a readable secondary. For this reason, reporting workloads accessing memory-optimized tables run without any isolation-level mapping. The restrictions on isolation levels and hints are the same on the primary and secondary replicas. Please refer to Guidelines for Transaction Isolation Levels with Memory-Optimized Tables for details. This ensures that any query or reporting workload that can be run on a secondary replica will run without requiring any changes on the primary replica, and vice-versa. However, independent of the isolation level specified for the reporting workload, SQL Server provides only read-committed consistency on secondary replicas for memory-optimized tables. Please refer to Active Secondaries: Readable Secondary Replicas (AlwaysOn Availability Groups) for details. In CTP2, you need to enable trace flag 9989 to query memory-optimized tables on a secondary replica.
      • Non-durable tables: AlwaysOn Availability Groups use transaction log transport to populate tables on the secondary replica. Since changes to non-durable tables are not logged, these tables are not populated on the secondary replica. However, their schema will exist on the replicas as part of the objects contained within the database.
      • Performance: There are two aspects of performance. First, with the increased transactional throughput of In-Memory OLTP, can the secondary replica keep up? The short answer is that it depends on the workload. A workload with high transactional throughput can put pressure on log transport as well as on the redo thread. With In-Memory OLTP, when applying changes to memory-optimized tables, the redo thread does not incur the IO-related latency it does for disk-based tables, because the data is already in memory. Second is the performance of the reporting workload on the secondary replica. Its performance should be comparable to the performance on the primary replica. One limitation is that there is no auto-stats update on memory-optimized tables. You will need to manually update the statistics on the primary replica to generate up-to-date statistics, which then become available on the secondary replica after the redo thread processes the log records associated with the statistics. Please refer to Statistics for Memory-Optimized Tables for details.
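      In CTP2, querying memory-optimized tables on a secondary replica requires enabling the trace flag mentioned above on that replica; a minimal sketch:

      -- Enable trace flag 9989 globally on the secondary replica (CTP2 only).
      DBCC TRACEON (9989, -1);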
    • Log Shipping: SQL Server Log shipping allows you to automatically send transaction log backups from a primary database on a primary server instance to one or more secondary databases on separate secondary server instances. The transaction log backups are applied to each of the secondary databases individually.

    In-Memory OLTP: Log shipping is fully supported on databases with memory-optimized tables. A database with memory-optimized tables can be configured and managed similar to how you would manage a regular database.

    • Transactional Replication: This is a popular solution for offloading a read workload to one or more subscribers, especially when only a subset of the data, not the whole database, is needed on the subscriber. It is also used, to a lesser extent, for High Availability.

    In-Memory OLTP: Replication is supported on databases with memory-optimized tables, but with the limitation that you cannot use a memory-optimized table as an article in a publication or as part of a subscription. We will look into addressing this in a future release. In spite of this restriction, you can use memory-optimized tables in read-scale scenarios for read-mostly workloads as described below.

    Scenario: Let us say you have a couple of disk-based tables, t_disk1 and t_disk2, that are accessed heavily by read workloads. You want to replicate these two tables to the subscriber, but you want the read workload to run against memory-optimized tables. You can achieve this by creating corresponding memory-optimized tables t_memopt1 and t_memopt2 and populating them using DML triggers on t_disk1 and t_disk2. On any insert, update, or delete of a row, the triggered action updates t_memopt1 and t_memopt2 accordingly. Essentially, you have two copies of the same table: one is disk-based and the other is memory-optimized. Now you can direct your read workload to the memory-optimized tables.
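    A sketch of one such trigger on the subscriber follows; the column list is illustrative, and the corresponding UPDATE and DELETE triggers would follow the same shape:

    -- Mirror newly replicated rows from the disk-based table into the
    -- memory-optimized copy via interpreted (interop) Transact-SQL.
    CREATE TRIGGER tr_t_disk1_insert ON dbo.t_disk1
    AFTER INSERT
    AS
    BEGIN
        INSERT INTO dbo.t_memopt1 (id, payload)
        SELECT id, payload FROM inserted;
    END
    GO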

  • Microsoft a Leader in Gartner’s Magic Quadrant for Operational Database Management Systems

    This is one of the best times to be working in SQL Server. I just came from the O’Reilly Strata + Hadoop World conference this week, where there was a great dialog about advances in data management techniques. We’re already the world’s most deployed database, we just released the second technology preview for SQL Server 2014, and if you saw Microsoft’s recent financial results you know SQL Server continues to outpace the market: revenue grew double-digits, with SQL Server Premium revenue growing more than 30%. And the SQL Server momentum is not stopping.

    On October 21st, Gartner published the Magic Quadrant for Operational Database Management Systems. In the opening summary, Gartner states, “The operational DBMS market (formerly OLTP) is evolving dramatically, with new, innovative entrants and incumbents supporting the growing use of unstructured data and NoSQL DBMS engines. Information management leaders must understand the market implications affecting DBMS technology decisions.”

    Microsoft is truly honored to be included in the Leaders Quadrant* for this Magic Quadrant based on our completeness of vision and ability to execute.

    I couldn’t agree more with Gartner, and our customers agree as well. The early previews for SQL Server 2014 have been downloaded more than 50,000 times and are being previewed by more than 100 customers; you can download the preview here. Check out their case studies to read more about how customers such as SBI Liquidity and TPP are taking advantage of the in-memory capabilities in SQL Server 2014. Ferranti is a great example of using both In-Memory OLTP and HDInsight technologies to process a continuous data flow of up to 200 million measurement channels, making the system fully capable of meeting the demands of smart meter technology. Go here to read the full Gartner report.

    Eron Kelly
    General Manager
    SQL Server Marketing

    *  Gartner, Magic Quadrant for Operational Database Management Systems by Donald Feinberg, Merv Adrian and Nick Heudecker, October 21, 2013.

    This graphic was published by Gartner, Inc. as part of a larger research document and should be evaluated in the context of the entire document. The Gartner document is available at http://www.gartner.com/reprints/server-tools?id=1-1M9YEHW&ct=131028&st=sb.

    Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings. Gartner research publications consist of the opinions of Gartner's research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose. 

  • Revolutionizing City Planning in the 21st Century with Windows Azure, HDInsight, and SQL Server

    If you’ve ever played a video game where you manage a virtual city, you understand how real-time insight into services including emergency response teams and transportation is needed for effective city planning. Until recently, the technologies just weren’t available to support this kind of detailed “big picture” view enjoyed by gamers. However, now that Big Data business intelligence (BI) solutions are a reality, Barcelona, Spain is working to achieve transformative insight so that it can better meet the needs of its citizens. 

    To test the feasibility of a Big Data BI solution, Barcelona created a pilot that runs on a hybrid cloud based on Windows Azure, Windows Azure HDInsight Service, and Microsoft SQL Server 2012. With it, users can view real-time BI that combines petabytes of existing data in the city’s systems with new, unstructured public data sources such as Twitter, app log files, and GPS signals from cell phone providers. The BI is served up via mobile apps, dashboards, custom reports, and data services.

    The variety of technology choices in the Microsoft platform simplified development. For example, engineers met the diverse needs of users and devices using HDInsight and built-in tools in SQL Server, Microsoft Office, and Visual Studio. Engineers were also able to manage the project’s varied data and cost requirements with Apache Hadoop and Azure’s other storage options, which you can read about here.

    Everyone wins with this kind of solution. City employees are more productive because they quickly see how services are working from “big picture” views that include detailed numbers and social media feedback. Citizens enjoy a better quality of life because services like emergency teams, transportation, schools, and festivals are customized to meet real needs. Businesses can also increase success by quickly recognizing investment opportunities in specific neighborhoods. Other cities also benefit because they can adopt this repeatable smart-city infrastructure to transform insight, lifestyles, and economic success.

    You can learn more about Barcelona’s pilot solution by reading the case study and watching the video here.   

  • Customers get Faster and Better Insight into More Data from More Sources with Windows Azure HDInsight Service

    Collecting Big Data is significant only if you can make use of it. With Windows Azure HDInsight Service, organizations can use Apache Hadoop clusters in Windows Azure. Combining the Microsoft platform with open-source tools like Hadoop equates to unprecedented insight into more data, from more sources, in any format—structured and unstructured. Here are a few examples:

    Chr. Hansen develops food, pharmaceutical, and agricultural ingredients for global clients. By creating a hybrid cloud solution in less than one week with Windows Azure, HDInsight, and existing SQL Server 2012 databases, the company increased the number of trials it could analyze by 100 times. Read the full story here.

    Ascribe is a leading provider of clinical IT solutions in the United Kingdom. To recognize outbreaks of health threats faster, Ascribe created a hybrid cloud solution in just six weeks using Windows Azure, HDInsight, and SQL Server 2012. Now in just seconds, automated processes can analyze millions of records from disparate sources including social media feeds—and deliver potentially life-saving alerts to mobile devices and desktops. Read more here.

    Virginia Polytechnic Institute and State University (Virginia Tech) is a leading US research institution that recently implemented an on-demand, cloud computing model using HDInsight. With it, the university is saving money, and researchers working on new cancer therapies and antibiotics are accomplishing more and enjoying easier access to DNA-sequencing tools and resources. The details are here.

    There are lots of Big Data and cloud-service options but what sets Windows Azure HDInsight Service apart is that it’s part of the larger end-to-end Microsoft platform, so you get more than just a service. You get built-in tools for creating highly effective, available, scalable, and affordable enterprise-ready solutions that include BI (supported by SQL Server and Microsoft Office) and mobile apps (created with Visual Studio). On top of that, Windows Azure solutions can run entirely in the cloud—or work with existing systems. You get to choose.

    To learn more about what’s possible with the Microsoft platform, read Microsoft’s Cloud OS Vision.

  • Windows Azure HDInsight Released for GA

    Today, Microsoft announced the general availability of Windows Azure HDInsight, its cloud-based distribution of Hadoop. Further expanding on Microsoft’s big data strategy, Microsoft CVP Quentin Clark will deliver a keynote at O’Reilly Strata + Hadoop World tomorrow, where he will discuss Microsoft’s vision to democratize big data by providing easy to use tools and a unified data platform. Clark further detailed this approach in a post on the Official Microsoft Blog.

    News Highlights from Strata + Hadoop World Conference

    • Windows Azure HDInsight, available today, is an open and flexible platform that provides 100% Apache Hadoop as a service in the cloud. HDInsight builds on top of the Hortonworks Data Platform (HDP) to ensure full compatibility with Apache Hadoop. HDInsight offers customers:
      • Insights with familiar tools: Through integration with Microsoft’s business intelligence tools, HDInsight enables customers to easily analyze data in Excel.
      • Deployment Agility: With HDInsight customers can deploy and provision a Hadoop cluster, no matter the size, in minutes instead of hours or days, without losing any data.
      • Enterprise-ready Hadoop: HDInsight offers enterprise-class security and manageability with a secure node in every cluster and extensive support for PowerShell scripting.
      • Rich developer experience: Developers have a choice of languages including .NET, Java and others. .NET developers can exploit the full power of language-integrated query with LINQ to Hive, and database developers can use existing skills to query and transform data through Hive.
    • Customers such as Chr. Hansen, City of Barcelona, and Virginia Tech are using HDInsight to give their data the power of Hadoop and the scalability of Windows Azure.
    • Interested customers can begin using Windows Azure HDInsight now.

    For more information on today’s news, check out Clark’s post on the Official Microsoft Blog. To follow the news from O’Reilly Strata + Hadoop World, follow the conversation on Twitter at @SQLServer and #strataconf.

     
  • A Look Back at PASS Summit 2013

    PASS Summit 2013 has been over for a little more than a week and we are already looking forward to what's next in 2014. PASS Summit is the world's largest and most intensive conference for Microsoft SQL Server and BI professionals and it just keeps growing!  We are pleased to have had over 5,000 registrations for this conference that featured over 190 technical sessions, 6 deep-dive Half-Day Sessions and 14 Pre-Conference Sessions. If you were not able to join us in Charlotte or missed the live stream, check out the PASSTV site to view the PASS Summit 2013 keynotes as well as gain insight into many of the sessions and learning opportunities provided at the conference. 

    To set the stage for this year's 3-day conference, we heard first-hand from Microsoft Corporate Vice President Quentin Clark and Microsoft Technical Fellow Dr. David DeWitt of the Gray Systems Lab.

    Quentin Clark kicked off Day One with the keynote, "Microsoft’s Data Platform – The Road Ahead".  With an audience of over 5,000 viewing in person and via live stream, he discussed how customers are pushing the boundaries of what’s possible for businesses today using the advanced technologies in our data platform. Also during his discussion, he announced the second Community Technology Preview (CTP2) of SQL Server 2014 which features breakthrough performance with In-Memory OLTP and simplified backup and disaster recovery in Windows Azure.

    David DeWitt kicked off Day Two with a keynote on "Hekaton: Why, What, and How".  He shared deep insights on In-Memory OLTP's ability to deliver performance improvements for transactional workloads.

    Outside of the great keynotes, participants attended a wide variety of technical sessions presented by our experts and industry leaders. The conference also featured Microsoft on-site certification testing, instructor-led workshops, technical guidance from Microsoft Customer Service and Support (CSS) engineers, and hands-on support from Azure Customer Advisory Teams (CAT).

    So there you have it -- it's a wrap on PASS Summit 2013, but keep in mind that we are already planning the PASS Business Analytics Conference, taking place May 7-9 in San Jose, California, and PASS Summit 2014, which will take place November 4-7 in Seattle, Washington. We hope to see you there!

  • Backup and Restore Enhancements in SQL Server 2014 CTP2

    SQL Server 2014 CTP2 includes three SQL Server Backup and Restore enhancements.  These enhancements enable using Windows Azure for backups, provide options to ease and automate backup management, and improve the security of backups through encryption.

    In addition, to enable backup to Windows Azure from previous versions of SQL Server, a stand-alone tool is now available.  A brief description of the tool and the link to the download center is included at the end of this article.

    The three enhancements are described in detail below:

    SQL Server Management Studio Support for SQL Server Backup to URL (Windows Azure storage)

    Support for SQL Server backup to and restore from Windows Azure storage was introduced in SQL Server 2012 SP1 CU2 with Transact-SQL, SMO, and PowerShell support. In the current release, the feature is further enhanced to include UI support.  The Backup and Restore tasks and maintenance plans now include URL as a destination option when backing up or restoring a database.

    Following are the UI changes in SSMS:

    Backup UI Changes: 

    1. When you right-click a database, select Tasks, and then choose Back Up, the Back Up Database dialog that launches includes new options to support backup to Windows Azure. The dialog box has three pages: General, Media Options, and Backup Options. The General page now includes URL as one of the destination options. 

    Backup Database

      1. File Name: Name of the backup file.
      2. SQL Credential:  You can either specify an existing SQL Server Credential or create a new one by clicking the Create… button.  The dialog that opens when you click Create… requires a management certificate or a publishing profile for the Azure subscription. If you do not have access to the management certificate or publishing profile, you can create a credential by specifying the Windows Azure storage account name and access keys.  In SSMS, from the instance, right-click Security, select New, and then select Credential to create a new SQL Credential.
      3. Azure storage container: The name of the Azure storage container.
      4. URL prefix: This is generated automatically from the information you provide in the fields above.  You don't have to edit this field; if you do, make sure it matches the other information. For example, if you manually edit the storage URL, the SQL Credential must contain the authentication information for the storage you specify.

    2.  If you click the Create... button, you will see a new dialog titled Backup to URL – Create Credential as shown below.

    Create Credential

      1. SQL Credential: Enter the name for the new SQL credential.  This credential is used by SQL Server to authenticate to the Windows Azure storage account.
      2. Windows Azure Credentials:  When using this dialog to create a credential, you must provide a Windows Azure Management Certificate added to the local certificate store or a publishing profile downloaded to your computer to validate the subscription and the storage account information.  If you use the publishing profile, the subscription and the certificate are automatically populated.
      3. Subscription ID: Select, type, or paste the Windows Azure subscription ID that matches the management certificate from the local certificate store. 
      4. Storage Account: Select the account you want to use for storing the backups. The system stores this information in the SQL Credential that is created when you click Create.

    3.  When you select URL as the destination, only certain options are enabled on the Media Options page. If you want to overwrite a backup stored in Windows Azure storage, you can do so in Transact-SQL by using the WITH FORMAT option.
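    Although the steps above describe the SSMS experience, the same backup can be expressed in Transact-SQL. The following is a sketch only; the database name, storage account, container, and credential name are placeholders:

    ```sql
    -- Back up a database directly to Windows Azure storage.
    -- 'MyCredential' is a SQL Credential that holds the storage account
    -- name and access key; the URL and all names are placeholders.
    BACKUP DATABASE MyTestDB
    TO URL = 'https://mystorageaccount.blob.core.windows.net/mycontainer/MyTestDB.bak'
    WITH
        CREDENTIAL = 'MyCredential',
        FORMAT,      -- overwrite an existing blob with the same name
        STATS = 10;
    GO
    ```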


    Maintenance Plan Wizard has similar changes to the Define a Backup Page to support backup to Windows Azure storage.

    Restore UI Changes: 

    In the Restore task, you now have the option of selecting URL as the device to restore from.  If you select URL, you are asked to specify the SQL Credential to authenticate to the Windows Azure storage. 

    1. When you select Device on the General page of the Restore task, the Backup Devices dialog that launches includes URL as a backup media Type.

     

    2.  When you select URL as the media type and click Add, you see a new dialog, Connect to Windows Azure Storage, which allows you to specify the SQL Credential used to authenticate to the storage account.

     
    3.  Once connected, the backup files are retrieved from the storage account.
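    For reference, a restore from URL can also be issued in Transact-SQL; this is a sketch with placeholder names:

    ```sql
    -- Restore a database from a backup file stored in Windows Azure storage.
    -- The SQL Credential authenticates to the storage account; the URL and
    -- credential name below are placeholders.
    RESTORE DATABASE MyTestDB
    FROM URL = 'https://mystorageaccount.blob.core.windows.net/mycontainer/MyTestDB.bak'
    WITH
        CREDENTIAL = 'MyCredential',
        STATS = 10;
    GO
    ```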

     

    Here is the link to the topic that describes the feature in detail:  SQL Server Backup to URL

    SQL Server Managed Backup to Windows Azure Storage

    Built on top of the SQL Server Backup to URL technology, SQL Server 2014 CTP2 includes the option to let SQL Server create and manage the backup strategy. SQL Server creates a backup strategy intelligently, one that is based on recoverability to a point in time within the specified retention period and transaction activity on the database. The destination for the backup file is Windows Azure Storage.  Once you configure this feature either for a database or for an entire instance, SQL Server manages the backup frequency for full and log backups.  It supports point in time restore based on the retention time period you specify when configuring SQL Server Managed Backup. It also supports the use of encryption.  Encryption is also a new feature released in SQL Server 2014 CTP2 and is described later in this blog post.

    Following are sample Transact-SQL statements for configuring SQL Server Managed Backup to Windows Azure. 

    For a database:

    USE msdb;
    GO
    EXEC smart_admin.sp_set_db_backup
    @database_name='TestDB'
    ,@enable_backup=1
    ,@retention_days=30
    ,@credential_name='MyCredential'
    ,@encryption_algorithm='AES_256'
    ,@encryptor_type='Certificate'
    ,@encryptor_name='MyServerCert01';
    GO

    For an instance of SQL Server:

    USE msdb;
    GO
    EXEC smart_admin.sp_set_instance_backup
    @enable_backup=1
    ,@retention_days=30
    ,@credential_name='sqlbackuptoURL'
    ,@encryption_algorithm='AES_128'
    ,@encryptor_type='Certificate'
    ,@encryptor_name='MyBackupCert';
    GO

    Important: To create a backup certificate to use when configuring smart backup, use the steps described in the Encrypted Backup section below.
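    After configuring SQL Server Managed Backup, you can inspect the current settings. As a sketch, assuming the smart_admin helper functions in this CTP, a query like the following returns the configuration:

    ```sql
    -- Inspect Managed Backup settings (the helper-function names assume
    -- the CTP2 smart_admin schema in msdb).
    USE msdb;
    GO
    -- Settings for a single database:
    SELECT * FROM smart_admin.fn_backup_db_config('TestDB');
    -- Instance-level default settings:
    SELECT * FROM smart_admin.fn_backup_instance_config();
    GO
    ```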

    Here is the link to the topic that describes the feature in detail:  SQL Server Managed Backup to Windows Azure

    Encrypted Backup

    To secure your backup data, you can now choose to encrypt a backup when you create it.  The encryption options include an encryption algorithm and a certificate or asymmetric key to be used for the encryption.  Only asymmetric keys residing in an Extended Key Management (EKM) provider are supported.

    Multiple encryption algorithms, up to AES 256-bit, are supported, giving you a variety of algorithms to choose from. Adding backup encryption to a TDE-encrypted database gives additional protection for your data. 

    You can control the process using Transact-SQL, SMO, PowerShell, or SSMS. Encryption options can also be used when configuring SQL Server Managed Backup to Windows Azure, therefore providing additional security for your off-site data.

    It is very important to back up the certificate or key used for encryption to a location different from the backup it is used to encrypt.  Without the certificate or key, the backup file cannot be restored.

    Important: Restore validates the thumbprint of the certificate during the restore operation, so the certificate used to create the backup must be retained in its original state.  Renewing the expiration date, for example, changes the thumbprint and can render the certificate unusable.

    Below are some simple steps you can follow to test out the feature. 

    1.  Create a Database Master Key for the master database on the instance

    USE master; 
    GO
    CREATE MASTER KEY ENCRYPTION BY PASSWORD = '23987hxJ#KL95234nl0zBef';

    2.  Create an encryption certificate 

    USE master;
    GO
    CREATE CERTIFICATE MyTestDBBackupEncryptCert
    WITH SUBJECT = 'MyTestDB Backup Encryption Certificate';
     
    3. Back up the database, choosing the encryption option and specifying an encryption algorithm and the certificate to use.
    BACKUP DATABASE [MyTestDB] 
    TO DISK = N'C:\Program Files\Microsoft SQL Server\MSSQL12.MSSQLSERVER\MSSQL\Backup\MyTestDB.bak'
    WITH
    ENCRYPTION
    (
    ALGORITHM = AES_256,
    SERVER CERTIFICATE = MyTestDBBackupEncryptCert
    ),
    STATS = 10;
    GO
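    As noted earlier, the certificate itself must be backed up to a safe location separate from the database backups. A minimal sketch (the file paths and private-key password are placeholders):

    ```sql
    -- 4. Back up the encryption certificate and its private key to files
    --    stored apart from the encrypted database backup.
    USE master;
    GO
    BACKUP CERTIFICATE MyTestDBBackupEncryptCert
    TO FILE = 'C:\Certs\MyTestDBBackupEncryptCert.cer'
    WITH PRIVATE KEY
    (
        FILE = 'C:\Certs\MyTestDBBackupEncryptCert.pvk',
        ENCRYPTION BY PASSWORD = '9n34khUbhk$w4ecJH5gh'
    );
    GO
    ```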

    Backup UI Changes:

    The Backup Options page of the Back Up Database Task dialog has the new encryption options.

     

     The Maintenance Plan Wizard has similar changes to the Define a Backup page to support the new encryption options.

    Here is the topic that describes the feature in detail:  Backup Encryption 

    SQL Server Backup to Windows Azure Tool

    This stand-alone tool can be configured to redirect SQL Server backups to Windows Azure storage.  Intended for SQL Server versions that do not have the built-in ability to back up to Windows Azure, it also includes options to encrypt and compress backups.

    For more information, see the download page.

  • Smart, Secure, Cost-Effective: SQL Server Back Up to Windows Azure

    Microsoft recently announced several new ways to back up and recover SQL Server databases with Windows Azure.  These features, now available in SQL Server 2014 CTP2 and as a standalone tool for prior versions, provide an easy path to cloud backup and disaster recovery for on-premises SQL Server databases.  The capabilities for backing up to Windows Azure Storage help to reduce storage costs and unlock the data protection and disaster recovery benefits of cloud data storage.

    Benefits of the new SQL Server Backup to Windows Azure include:

    • Cost-effective – Backing up to the cloud reduces CAPEX and OPEX by shifting from on-premises storage to Windows Azure Blob Storage Service.  Windows Azure offers lower TCO than many on-premises storage solutions and decreased administrative burden.
    • Secure – Backup to Windows Azure adds encryption to backups stored either in the cloud or on-premises, for an extra layer of security and compliance.
    • Durable – Backups and replicas in the cloud enable users to recover from local failures quickly and easily.  Windows Azure storage is reliable and built with data durability in mind: off-site, geo-redundant and easily accessible. 
    • Smart – SQL Server can now manage your backup schedule using the new Managed Backup to Windows Azure feature. It determines backup frequency based on database usage patterns.
    • Consistent – The combination of SQL Server 2014 in-box functionality and the Backup to Windows Azure Tool for prior versions creates a single backup-to-cloud strategy across all your SQL Server databases.

    Simplified Cloud Backup

    There are several new capabilities that enable users to rely on Windows Azure for SQL Server backup and disaster recovery.

    • SQL Server Backup to Windows Azure – With SQL Server 2014, users can easily configure Azure backup storage. In the event of a failure, a backup can be restored to an on-premises SQL Server or one running in a Windows Azure Virtual Machine. Options for setting up backup include:
      • Manual Backup to Windows Azure - Users can configure backup to Windows Azure by creating a credential in SQL Server Management Studio (SSMS). These backups can be automated using a backup policy. 
      • Managed Backup to Windows Azure – Managed Backup is a premium capability of Backup to Windows Azure that measures database usage and patterns to set the frequency of backups to Windows Azure, optimizing networking and storage.  Managed Backup helps customers reduce costs while achieving greater data protection. 
      • Encrypted Backup – SQL Server 2014 offers users the ability to encrypt both on-premises backup and backups to Windows Azure for enhanced security.
    • SQL Server Backup to Windows Azure Tool - A stand-alone download that can be quickly and easily configured to back up to Windows Azure Storage for SQL Server 2005 and later versions. It can also encrypt backups stored either locally or in the cloud.

    Learn More

    SQL Server 2014's goal is to deliver mission-critical performance along with faster insights into any data, big or small. At the same time, it will enable new hybrid cloud solutions that can provide greater data protection and positively impact your bottom line. Early adopters are already leveraging new hybrid scenarios to extend their backup and disaster recovery capabilities around the globe without the need for additional storage replication technologies.

    If you would like to try the Backup and Recovery Enhancements in SQL Server 2014 CTP2, Thursday’s blog post will help you get started configuring and using these new capabilities.  You can also preview the SQL Server Backup to Windows Azure Tool, enabling backup to Windows Azure for SQL Server 2005 and forward.

    You can learn more about Hybrid Cloud scenarios in SQL Server 2014 by reading the SQL Server 2014 Hybrid Cloud White Paper, which is part of the SQL Server 2014 Product Guide. 

  • Customers Gain Roadmap for the Future with Microsoft, SQL Server 2014 and In-Memory OLTP

    There’s a lot of excitement around the In-Memory OLTP database engine in SQL Server 2014—and for good reason. Many companies are already realizing impressive performance gains by using it. Here are just a few examples:

    Edgenet manages data for Fortune 500 firms, and its success depends on updating clients’ databases so that they accurately reflect transactions, especially for databases supporting online shopping carts. By upgrading to SQL Server 2014, Edgenet boosted throughput sevenfold with In-Memory OLTP. Read the case study here.

    SBI Liquidity Market manages more than US$1 trillion each month in foreign currency exchange trading. To increase scalability and speed, it upgraded its trading system to SQL Server 2014. With In-Memory OLTP, SBI Liquidity Market improved throughput by 10x, processing up to 200,000 transactions per second. Read more about this story here.

    TPP manages about half of the United Kingdom's 60 million medical records through its hosted clinician service. Fast access to medical records is critical.  In the most extreme scenarios, if a doctor can't access the system fast enough, patients' lives could be at stake.  When TPP started testing In-Memory OLTP by upgrading its databases to SQL Server 2014, it got the solution up and running in only half a day.  Although the databases were running on existing servers, TPP observed a sevenfold performance improvement.  Further details on their story are here.

    While In-Memory OLTP is transforming performance, it’s just one feature in the broader Microsoft platform that’s supporting modern applications’ demand for intense scale and ease of access.

    This end-to-end platform is what sets Microsoft apart, and that’s why I’d like to echo some points recently made by Satya Nadella, Microsoft’s Executive Vice President of Cloud and Enterprise. In 10 years, I believe Microsoft will still be at the core of computing because the company is continuing to shape the industry with innovations such as the hybrid cloud based on its Cloud OS. Microsoft also offers fast data insights with BI tools that “led the pack” according to Forrester Research. Ongoing adoption of Microsoft’s platform by the world’s top companies is also testament to the company’s vision.

    For more information about In-Memory OLTP, including its lock-free and latch-free concurrency controls, read this article.  SQL Server 2014 CTP2 is now available for download here.