In Storage Allocation and Management for Memory-Optimized Tables, we briefly touched upon the Merge operation. This blog describes the Merge operation in detail. We also recommend that you refer to implementing durability for memory-optimized tables for a good understanding of data/delta files and CFPs.
The metadata of all Checkpoint File Pairs (CFPs) that exist in storage is stored in an internal array structure referred to as the Storage Array. It is a finitely sized array of CFPs (4096 entries in CTP2 and 8192 in RTM) that supports a cumulative size of 256GB for durable memory-optimized tables in the database. The entries in the storage array are ordered by transaction range. The CFPs in the storage array (along with the tail of the log) represent all the on-disk state required to recover a database with memory-optimized tables.
Running an OLTP workload over time can cause the storage array to fill up, leading to many inefficiencies as described here.
To alleviate these inefficiencies, older closed CFPs are merged based on a merge policy (described below) so that the storage array is compacted to represent the same set of data with a reduced number of CFPs and less storage.
The Merge operation takes one or more closed CFPs, called MERGE SOURCE(s), based on an internally defined merge policy, described later, and produces one resultant CFP, called MERGE TARGET. The entries in each delta file of source CFP(s) are used to filter rows from the corresponding data file to remove the data rows that are not needed anymore. The remaining rows in the source CFPs are consolidated into one target CFP. After the merge is complete, the resultant CFP (i.e. the merge-target) replaces the source CFPs (i.e. the merge sources). The merge-source CFPs go through a transition phase before they are eventually removed from the storage.
In the example below, the memory-optimized table filegroup has four data and delta file pairs at timestamp 500 containing data from previous transactions. For example, the rows in the first data file correspond to transactions with timestamp > 100 and <= 200, alternatively represented as (100, 200]. The second and third data files are less than 50% full after accounting for the rows marked deleted. The merge operation combines these two CFPs and creates a new CFP containing transactions with timestamp > 200 and <= 400, the combined range of these two CFPs. Another CFP with range (500, 600] and a non-empty delta file for transaction range (200, 400] show that the Merge operation can run concurrently with transactional activity, including deleting more rows from the source CFPs.
In the In-Memory OLTP engine, a background thread evaluates all closed CFPs using a merge policy and then initiates one or more merge requests for the qualifying CFPs. These merge requests are processed by the offline checkpoint thread. The merge policy is evaluated periodically and also whenever a checkpoint is closed.
SQL Server 2014 implements the following merge policy:
A maximum of 10 consecutive CFPs can be part of one merge operation.
Here are some examples that show the CFPs that will be merged under the merge policy.
Adjacent CFPs: source files (% full) and the resulting merge selection

Source files (% full): CFP0 (30%), CFP1 (50%), CFP2 (50%), CFP3 (90%)
Merge selection: (CFP0, CFP1). CFP2 is not chosen because it would make the resultant data file > 100% of the ideal size.

Source files (% full): CFP0 (30%), CFP1 (20%), CFP2 (50%), CFP3 (10%)
Merge selection: (CFP0, CFP1, CFP2). Files are chosen starting from the left. CFP3 is not combined because it would make the resultant data file > 100% of the ideal size.

Source files (% full): CFP0 (80%), CFP1 (30%), CFP2 (10%), CFP3 (40%)
Merge selection: (CFP1, CFP2, CFP3). CFP0 is skipped because, if combined with CFP1, the resultant data file would be > 100% of the ideal size.
Not all CFPs with available space qualify for merge. For example, if two adjacent CFPs are 60% full, they will not qualify for merge, resulting in 40% wasted storage in those CFPs. In the worst case, all CFPs can be a little over 50% full, leading to storage utilization of only about 50%. It is important to note that deleted rows may exist in storage even after they have already been removed (i.e. garbage collected) from memory. From a garbage collection perspective, storage and memory are managed somewhat independently. A rule of thumb is that the storage taken by 'Active' CFPs (note, this does not account for all CFPs) can be up to 2x larger than the size of the durable tables in memory.
You can force a merge using the stored procedure sys.sp_xtp_merge_checkpoint_files (Transact-SQL) to override the merge policy. For details, please refer to http://msdn.microsoft.com/en-us/library/dn198330(v=sql.120).aspx
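As a minimal sketch (parameter names as documented for SQL Server 2014; the database name and transaction-range bounds below are placeholders you would take from your own checkpoint file metadata):

```sql
-- Force a merge of the CFPs covering a given transaction range.
-- 'mydatabase' and the bounds 100/400 are hypothetical values.
EXEC sys.sp_xtp_merge_checkpoint_files
    @database_name = N'mydatabase',
    @transaction_lower_bound = 100,
    @transaction_upper_bound = 400;
```

The forced merge request is picked up and processed by the same offline checkpoint thread that handles automatic merges.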
As mentioned earlier, once the CFPs are merged, they go through a state transition before they can be deallocated. At any given time, the CFP data/delta file pairs can be in one of the following states:
After accounting for the storage taken by CFPs in various states, the overall storage taken by durable memory-optimized tables can be much larger than 2x. The DMV sys.dm_db_xtp_checkpoint_files can be queried to list all CFPs in the memory-optimized filegroup, including the phase they are in. Transitioning CFPs from the MERGE_SOURCE state to TOMBSTONE and ultimately getting garbage collected can take up to 5 checkpoints plus a transaction log backup, if the database is configured with the full or bulk-logged recovery model. You can, of course, manually force a checkpoint followed by a log backup to expedite garbage collection, but this will add 5 empty CFPs (i.e. 5 data/delta file pairs with a data file of 128MB each). In production scenarios, the automatic checkpoints and the log backups taken as part of your backup strategy will seamlessly transition CFPs through the various states without requiring any manual intervention. The impact of the garbage collection process is that in-memory databases may have a disproportionate storage footprint compared to their size in memory. It is not uncommon for the storage occupied by memory-optimized tables to be 3x their size in memory.
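For example, a query along these lines lists each checkpoint file with its state (column names as documented for SQL Server 2014; they may differ slightly in CTP builds):

```sql
-- List the checkpoint file pairs in the memory-optimized filegroup
-- together with their current state.
SELECT file_type_desc,
       state_desc,
       file_size_in_bytes / 1024 / 1024 AS file_size_MB
FROM sys.dm_db_xtp_checkpoint_files
ORDER BY state_desc, file_type_desc;
```

Running this before and after forced checkpoints and log backups is a convenient way to watch CFPs move through the states.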
In the next blog, we will walk through an example showing transition of CFPs across all phases.
Today, Hortonworks announced the general availability of the Hortonworks Data Platform (HDP) 2.0 for Windows. This is the next release of the industry’s only 100% Apache Hadoop-based distribution for Windows. With Windows Server leading all operating systems in usage share (two out of every three servers run Windows Server), HDP 2.0 for Windows dramatically broadens the accessibility of Hadoop.
This release is important because it is engineered on the latest version of Apache Hadoop 2.2. It will take advantage of the power of YARN, offer improved high availability and mean time to recovery, and incorporate phase 2 of the Stinger initiative and HBase 0.96 to deliver Snapshots.
As part of our overall big data approach, we are deeply invested in ensuring customers have choice and interoperability when building and running Hadoop-based solutions. This includes providing customers the choice to deploy Hadoop data in the cloud on Windows Azure. Applications built on HDP for Windows on-prem can be easily deployed in the cloud for platform flexibility. Additionally, customers can leverage Power BI for Office 365 to connect to either HDP 2.0 for Windows or Windows Azure HDInsight to empower Microsoft Excel users with compelling insights on big data.
We’re partnering closely with Hortonworks to bring the community new Hadoop solutions for Windows. HDP for Windows continues our joint commitment to opening up the Hadoop platform to as many people as possible.
You can find out more about today’s announcement by:
Herain Oberoi, Director of Product Marketing
As described in implementing durability for memory-optimized tables, the storage requirements and their management for memory-optimized tables are very different from those of disk-based tables. If you are migrating a subset of your disk-based tables to memory-optimized tables, or are developing a new application with memory-optimized tables, you will notice these differences. Unlike disk-based tables in SQL Server, which use 8K pages to store data in the same format both on disk and in memory, memory-optimized tables use a different size and format in memory and on disk. This has been a point of concern for many customers, as they have sometimes observed disproportionate storage consumed by memory-optimized tables. The goal of this blog is to alleviate this confusion by describing how storage is managed over the lifetime of data rows.
The data for memory-optimized tables is stored in one or more data/delta file pairs (also referred to as checkpoint file pairs, or CFPs), with data files storing inserted rows and delta files referencing deleted rows. During the execution of an OLTP workload, as DML operations update, insert, and delete rows, new data/delta files are created to persist the data, and existing delta files are updated to record deletes of existing rows. Over time, if the number of active rows (after accounting for deleted rows) in two or more consecutive CFPs falls below a threshold (usually < 50%) such that they can be merged into one CFP of 128 MB, they are merged automatically by a background merge process into a new CFP. Once the merge operation is complete, the older CFPs go through a transition phase and are eventually removed (i.e. garbage collected) from storage. Note that SQL Server 2014 CTP2 supports up to 4096 CFPs within a database; this limit will be raised to 8192 CFPs in RTM.
At any given time, the data/delta file pairs are in one of the following 4 categories:
Transitioning CFPs out of category 3 and category 4 can take up to 5 checkpoints plus a transaction log backup, if the database is not running in the simple recovery model. For in-memory databases, an automatic checkpoint is taken when the transaction log has grown by 512MB since the last checkpoint. You can, of course, manually force a checkpoint followed by a log backup to expedite garbage collection, but this will add 5 empty CFPs (i.e. 5 data/delta file pairs with a data file of 128MB each). Typically, in a test environment, you may have to force transaction log backups and checkpoints to remove the CFPs that are no longer needed. In production scenarios, however, we don’t expect customers to take manual checkpoints; instead, rely on the automatic checkpoints and the log backups taken as part of the backup strategy. The impact of this garbage collection process is that in-memory databases may have a disproportionate storage footprint compared to their size in memory.
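A minimal sketch of forcing that checkpoint/log-backup cycle manually in a test environment (the database name and backup path are placeholders, and the cycle may need to be repeated several times before CFPs are fully transitioned):

```sql
-- Manually cycle a checkpoint and a log backup to help transition
-- merged CFPs toward garbage collection. 'mydatabase' and the backup
-- path are hypothetical.
USE mydatabase;
GO
CHECKPOINT;
GO
BACKUP LOG mydatabase TO DISK = N'C:\backup\mydatabase_log.trn';
GO
```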
To clarify the point further, let us walk through some examples of in-memory databases under various scenarios and compare their size in memory and on storage. For these examples, we will assume a SQL Server instance with 8 logical processors and one memory-optimized table with the following schema. Note that this table has a row size of approximately 8KB.
CREATE TABLE dbo.t_memopt (
c1 int NOT NULL,
c2 char(40) NOT NULL,
c3 char(8000) NOT NULL,
CONSTRAINT [pk_t_memopt_c1] PRIMARY KEY NONCLUSTERED HASH (c1) WITH
(BUCKET_COUNT = 100000)
) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA)
As indicated earlier, the fixed storage overhead for this configuration will be 1 GB (i.e. 8 data files of 128MB each).
Name memory_used_by_indexes_kb memory_used_by_table_kb
---------- ------------------------- -----------------------
t1 1024 8
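Output of this shape can be produced with a query along these lines against the sys.dm_db_xtp_table_memory_stats DMV (a sketch; the exact column list may vary by build):

```sql
-- Report per-table memory used by a memory-optimized table
-- and its indexes, in KB.
SELECT OBJECT_NAME(object_id)     AS Name,
       memory_used_by_indexes_kb,
       memory_used_by_table_kb
FROM sys.dm_db_xtp_table_memory_stats;
```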
Though these examples are oversimplified, they do illustrate the difference in storage provisioning for memory-optimized tables versus disk-based tables. Your storage requirement will depend on the following key factors:
SQL Server 2014 In-Memory OLTP introduces two new widely advertised kinds of database objects: memory-optimized tables for efficient data access, and natively compiled stored procedures for efficient query processing and business logic execution. However, there is a third kind of object introduced by In-Memory OLTP: the memory-optimized table type.
The use of memory-optimized table variables has a number of advantages over traditional table variables:
The typical usage scenarios for memory-optimized table variables are:
As with memory-optimized tables, SQL Server generates a DLL for each memory-optimized table type. As is the case for memory-optimized tables, the DLL includes the functions for accessing indexes and retrieving data from the table variables. When a memory-optimized table variable is declared based on the table type, an instance of the table and index structures corresponding to the table type is created in the user session. The table variable can then be used in the same way as traditional table variables: you can insert/update/delete rows in the table variable, and you can use the variable in T-SQL queries. You can also pass table variables into both natively compiled and traditional stored procedures as table-valued parameters (TVPs).
The following sample shows a memory-optimized table type from the AdventureWorks-based In-Memory OLTP sample.
CREATE TYPE [Sales].[SalesOrderDetailType_inmem] AS TABLE(
[OrderQty] [smallint] NOT NULL,
[ProductID] [int] NOT NULL,
[SpecialOfferID] [int] NOT NULL,
[LocalID] [int] NOT NULL,
INDEX [IX_ProductID] HASH ([ProductID])
WITH ( BUCKET_COUNT = 8),
INDEX [IX_SpecialOfferID] HASH ([SpecialOfferID])
WITH ( BUCKET_COUNT = 8)
)
WITH ( MEMORY_OPTIMIZED = ON )
As you can see, the syntax is very similar to traditional table variables. A few things to keep in mind:
The following example script shows declaration of a table variable as the memory-optimized table type Sales.SalesOrderDetailType_inmem, insert of three rows into the variable, and passing the variable as a TVP into Sales.usp_InsertSalesOrder_inmem.
DECLARE @od Sales.SalesOrderDetailType_inmem,
@SalesOrderID uniqueidentifier,
@DueDate datetime2 = SYSDATETIME()
INSERT @od (LocalID, ProductID, OrderQty, SpecialOfferID) VALUES
(1,888,2,1),
(2,450,13,1),
(3,841,1,1)
EXEC Sales.usp_InsertSalesOrder_inmem
@SalesOrderID = @SalesOrderID,
@DueDate = @DueDate,
@OnlineOrderFlag = 1,
@SalesOrderDetails = @od
Note that memory-optimized table types can be used as the type for stored procedure table-valued parameters (TVPs), and can be referenced by clients in exactly the same way as traditional table types and TVPs. Therefore, invocation of stored procedures with memory-optimized TVPs, including natively compiled stored procedures, works exactly the same way as the invocation of traditional stored procedures with traditional TVPs.
In SQL Server 2014, memory-optimized tables do not support cross-database transactions: you cannot access another database from the same transaction or the same query that also accesses a memory-optimized table. As a result, you cannot straightforwardly copy data from a table in one database to a memory-optimized table in another database.
Table variables are not transactional. Therefore, memory-optimized table variables can be used in cross-database queries, and can thus facilitate moving data from one database into memory-optimized tables in another. The idea is to use two transactions: in the first transaction you insert the data from the remote table into the variable; in the second transaction you insert the data into the local memory-optimized table from the variable.
For example, if you want to copy the row from table t1 in database db1 to table t2 in db2, using variable @v1 of type dbo.tt1, you would use something like:
USE db2
GO
DECLARE @v1 dbo.tt1
-- Transaction 1: copy the rows from the remote table into the variable
INSERT @v1 SELECT * FROM db1.dbo.t1
-- Transaction 2: insert the rows into the local memory-optimized table
INSERT dbo.t2 SELECT * FROM @v1
Those of you who have tried the new version of SQL Server Management Studio might have noticed that it has a couple of new wizards. One of them sits next to the already existing wizard for deploying to a Windows Azure SQL Database service (formerly SQL Azure), and it is called Deploy Database to a Windows Azure Virtual Machine (VM).
You might be wondering why this wizard is here and what it can do. The blog post below tries to answer this and some related questions.
We have seen many times that when people try a Virtual Machine in the Windows Azure environment, they quickly come up with a number of questions. Once the first set of questions is resolved, the next wave comes up, then the next, and so on.
I have tried to compile the list of things you could expect.
Those steps describe what needs to happen to end up with your workload in a Windows Azure VM. There are many small details in this process, as well as a lot of things to consider. You can consult this article if you need more information on the subject.
You might wonder how the process above relates to the wizard described in this post. This is exactly what the Deploy Database to a Windows Azure VM wizard does for you, plus a few additional conveniences.
Let’s see how this looks in the wizard.
The first screen gives you an idea of what information will be useful to have on hand while using the wizard.
The Source Settings screen only asks you to provide two pieces of information:
Once you have told the wizard where to get the data, it is time to connect to Windows Azure. There are 3 different options:
Once all authentication information is in place we connect to a Windows Azure environment, so you can start configuring it. There are a few things you could do:
When you type a new VM name and press the Settings… button a new dialog comes up. If you have seen Windows Azure Portal before you should be familiar with most fields in this new dialog.
Please note: the wizard tries to warn you if the target SQL Server instance might be a lower version than the source instance. Sometimes these are hard errors that do not allow you to continue, when we know for sure that the final configuration will not be compatible; other times we can only guess. In either case, the suggestion is not to ignore those warnings:
When you select a VM that already exists, the dialog looks different.
Please note two things here:
Once you have chosen settings for a new VM or connected to an existing one, the Target Database section becomes active. Please note that the Database name field is editable, meaning you can change your mind about the database name on the target machine.
Now you are just two easy steps away from having your database in the cloud. The first step is to confirm the selection you have made when following the wizard:
The second step is to wait until the process completes. Depending on the database size and your connection speed the time may vary, but the wizard will see it through to completion.
Finally, you will see the confirmation screen, which also contains a link to the detailed log.
This log is important for most troubleshooting, as well as for digging into the details.
The Deploy Database to a Windows Azure VM wizard has guided you through the deployment process. Now is the right time to see what the end result looks like.
However, a few manual steps might be needed.
The Deploy Database to a Windows Azure VM wizard is in place to simplify the steps needed to get a database into a Windows Azure VM environment. The whole wizard is built around scenarios in which you need your database in the cloud as soon as possible and are willing to skip some study or preparation steps as a tradeoff for deployment speed. The other assumption is that you don’t want to do the deployment manually and prefer SQL Server to do some of this work for you.
The wizard was designed to be extremely safe for your data: there is no chance it could delete, overwrite, or even modify your data. All operations it performs are read-only with respect to your data.
We really hope this new feature will help you to save time during your deployments.
With the release of SQL Server 2014 CTP2, you can now significantly boost the performance of your OLTP workloads in Windows Azure Virtual Machines. By creating a new VM with our preloaded image of SQL Server 2014 CTP2 on Windows Server 2012 R2, or installing SQL Server 2014 CTP2 on your VM, In-Memory OLTP functionalities are immediately available to you. This blog post provides a good guide on how to create a Windows Azure VM.
However, since the transition to In-Memory OLTP is not as simple as flipping a switch, you must carefully evaluate your application scenario and see if it is the right solution for you.
For SQL Server 2014 CTP 2, we recommend the following scenarios for SQL Server In-Memory OLTP on a Windows Azure Virtual Machine:
However, if your workload is suffering from long log I/O latency or if it is under pressure from log I/O throughput, and at the same time you require strict durability of your data, In-Memory OLTP on Windows Azure VM will not alleviate these problems.
SQL Server 2014 CTP2 on Windows Azure VM is not suitable for testing the performance of applications deployed in machines with similar configuration on premise.
The selection of VM sizes is important for any workload running in the IaaS space, whether you are provisioning a new VM using our preloaded CTP2 image or adding CTP2 onto an existing instance. Windows Azure provides a selection of virtual machine sizes for a variety of purposes, as listed in this article.
Since In-Memory OLTP is designed to serve extremely high session concurrency, and the nature of memory-resident data requires sufficient memory space, we recommend the following Windows Azure VM sizes for adopting In-Memory OLTP:
Compute Instance Name    Virtual Cores    RAM
Extra Large (A4)         8                14 GB
A6                       4                28 GB
A7                       8                56 GB
The exact instance size you choose will depend on the scenario you wish to run and the size of data you wish to make memory-resident in SQL Server In-Memory OLTP. We recommend provisioning 100% more memory than the estimated size of data in memory and leaving sufficient space for the buffer pool used by disk-based workloads on the same machine. This article on SQL Server Books Online has more information on how to estimate the size of a table in memory, and we have published a blog post on hardware considerations for In-Memory OLTP, some of which apply to provisioning VMs as well.
To configure a Windows Azure Virtual Machine for best performance in In-Memory OLTP, we suggest that you should follow the best practices outlined in this whitepaper. A summary of key considerations, plus some unique attributes for In-Memory OLTP, are listed below:
If you haven’t yet, be sure to check out SQL Server 2014 CTP2. Essentially feature complete, this second public community technology preview enables you to try and test all of the capabilities of the full SQL Server 2014 release. To help get you trained on the new capabilities, Microsoft Virtual Academy (MVA) has two lively, demo-filled events planned. First, on December 3rd, Mission Critical Performance with SQL Server 2014 will show how SQL Server 2014 can help you improve performance by 10-30x using your current hardware, while improving reliability at the same time. Then on December 4th, we’ll cover the benefits of moving your organization to the cloud, and how that can provide cost benefits while increasing scale and flexibility. Microsoft experts will demonstrate how you can make that move one step at a time, using SQL Server 2014 to create a hybrid environment.
Mission Critical Performance with SQL Server 2014
Date: December 3, 2013
Time: 9:00am – 5:00pm
Where: Live, online virtual classroom
Cost: Free!
Register now
Platform for Hybrid Cloud with SQL Server 2014
Date: December 4, 2013
Time: 9:00am – 5:00pm
Where: Live, online virtual classroom
Cost: Free!
In recent surveys by TDWI Research, roughly half of respondents report that they will replace their primary data warehouse (DW) platform and/or analytic tools within three years. Ripping out and replacing a DW or analytics platform is expensive for IT budgets and intrusive for business users. This raises the question: What circumstances would lead so many people down such a dramatic path?
It’s because many organizations need a more modern DW platform to address a number of new and future business and technology requirements. Most of the new requirements relate to big data and advanced analytics, so the data warehouse of the future must support these in multiple ways. Hence, a leading goal of the modern data warehouse is to enable more and bigger data management solutions and analytic applications, which in turn helps the organization automate more business processes, operate closer to real time, and through analytics learn valuable new facts about business operations, customers, products, and so on.
For organizations that need a modern data warehouse that satisfies new and future requirements, TDWI offers a checklist of our top six recommendations. These can guide your selection of vendor products and your solution design.
Users facing new and future requirements for big data, analytics, and real-time operation need to start planning today for the data warehouse of the future. To help them prepare, this TDWI Checklist Report drills into each of the six recommendations, listing and discussing many of the new vendor product types, functionality, and user best practices that will be common in the near future, plus the business case and technology strengths of each.
This checklist was sponsored by Microsoft.
To read the full report, sign up and download the report here.
Last week, we highlighted a whitepaper that focused on the performance benefits of SQL Server PDW and how it differs from traditional SQL Server. We’ve seen many SQL Server customers evolve to this latest PDW as the next-generation platform for their data warehouse infrastructure, for many of the reasons noted in that whitepaper. This week, we want to follow up on that by sharing our guide for how to migrate from a traditional data warehouse to PDW.
This whitepaper is based on real-world best practices and showcases how you can leverage your existing knowledge and skills around SQL Server to deploy SQL Server PDW as your next-generation data warehouse solution. Download the paper to find out more.
Last week, we introduced you to SQL Server PDW, the version of SQL Server built specifically for high performance data warehousing that delivers performance gains of up to 50x compared to traditional data warehouses. The next logical question we often get is “how is this possible?” Is it just SQL Server running on special hardware? And the answer is yes…but there is a lot more to it than that.
PDW has a shared-nothing, Massively Parallel Processing (MPP) architecture that allows for better performance when loading and querying data simultaneously, as well as when performing the types of complex queries common amongst today’s analytically driven enterprises. The MPP architecture is also what enables PDW’s seamless linear scalability from terabytes to 6 petabytes of data. This is a fundamentally different architecture from the Symmetric Multi-Processing (SMP) architecture that traditional SQL Server is based on, and this MPP architecture is really at the heart of what’s different about PDW and how it’s able to deliver the groundbreaking performance that it does.
To learn more about how PDW works and some of the performance gains we’ve seen in typical data warehouse scenarios, we encourage you to download this whitepaper today!
Memory-optimized tables must fully reside in memory and can’t be paged out. Unlike disk-based tables, where insufficient memory can slow down an application, the impact of an out-of-memory condition on memory-optimized tables can be severe, causing DML (i.e. delete, insert or update) operations to fail. While this adds a new dimension to managing memory, application failure due to resource errors is nothing new. For example, applications using disk-based tables can fail with resource errors such as running out of transaction log, TempDB, or storage. It is the responsibility of DBAs/administrators to make sure resources are provisioned and managed appropriately to avoid such failures. SQL Server provides a rich set of monitoring tools, including DMVs, PerfMon, and XEvents, to help administrators identify problems early so that corrective action can be taken. Similarly, for memory-optimized tables, SQL Server provides a rich set of monitoring capabilities and configuration options so that you can manage your database/instance well and keep your application running smoothly. The remainder of this blog walks through each of the challenges and details how it can be addressed.
This is the first question you need to consider when migrating existing tables to memory-optimized tables, or when developing a new application using memory-optimized tables. When migrating a disk-based table, you already know its current size (for example, using sp_spaceused as described in http://technet.microsoft.com/en-us/library/ms188776.aspx), so it is a simple calculation to find the corresponding size for memory-optimized tables. The key differences to be aware of are that memory-optimized tables cannot be compressed like disk-based tables with ROW and PAGE compression, so the memory-optimized table will likely be bigger. However, unlike indexes on disk-based tables, the indexes on memory-optimized tables are much smaller. For example, the index key is not stored with hash indexes, and all indexes are, by definition, covering indexes. Please refer to http://msdn.microsoft.com/en-us/library/dn247639(v=sql.120).aspx for details. A more challenging task is to estimate data growth. While you can make a reasonable guess, the best approach is to continuously monitor the table size and the memory consumed by memory-optimized tables in your database and instance. The same monitoring approach holds for new applications that are created with In-Memory OLTP in mind.
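For reference, the current size of a disk-based table can be read with sp_spaceused as mentioned above (the table name below is a placeholder):

```sql
-- Report reserved, data, and index size for a disk-based table
-- before migration. 'dbo.my_disk_table' is hypothetical.
EXEC sp_spaceused N'dbo.my_disk_table';
```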
The In-Memory OLTP engine is integrated with SQL Server. Memory allocations for memory-optimized tables are managed by the SQL Server Memory Manager, and the allocated memory is tracked using familiar constructs and tools such as memory clerks and DMVs. The following DMV query shows the XTP memory clerks. The first row shows the memory allocated by system threads, the second row with name DB_ID_5 represents the consumers in the database objects, and the third row with memory node id 64 represents memory allocated to the DAC (Dedicated Admin Connection).
-- Show the memory allocated to in-memory OLTP objects
select type, name, memory_node_id, pages_kb/1024 as pages_MB
from sys.dm_os_memory_clerks where type like '%xtp%'
type name memory_node_id pages_MB
-------------------- ---------- -------------- --------------------
MEMORYCLERK_XTP Default 0 18
MEMORYCLERK_XTP DB_ID_5 0 1358
MEMORYCLERK_XTP Default 64 0
Also, there are new DMVs that can be used to monitor the memory consumed by the in-memory OLTP engine and memory-optimized tables. Please refer to http://msdn.microsoft.com/en-us/library/dn133203(v=sql.120).aspx for details.
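One such DMV is sys.dm_xtp_system_memory_consumers, which breaks down the memory used by internal In-Memory OLTP system structures. A sketch (column names as documented for SQL Server 2014):

```sql
-- Break down memory allocated and used by internal In-Memory OLTP
-- system structures, in KB.
SELECT memory_consumer_type_desc,
       allocated_bytes / 1024 AS allocated_KB,
       used_bytes / 1024      AS used_KB
FROM sys.dm_xtp_system_memory_consumers;
```

For per-table figures, sys.dm_db_xtp_table_memory_stats serves the same purpose at the database level.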
Like any other memory consumer, the In-Memory OLTP engine responds to memory pressure, but only to a limited degree. For example, the memory consumed by data and indexes can’t be released even under memory pressure. This is different from disk-based tables, where external memory pressure may cause the buffer pool to shrink, which simply means there will be fewer data/index pages in memory. For this reason, it is all the more important to provision the memory for memory-optimized tables appropriately; otherwise the In-Memory OLTP engine can starve other memory consumers, including the memory needed by SQL Server for its own operations, which can ultimately lead to a slow or unresponsive application. To address this, SQL Server provides a configuration option to limit the memory consumed by memory-optimized tables.
Starting with SQL Server 2014, you can bind a database to a Resource Pool. This binding is only relevant when the database has one or more memory-optimized tables. The memory available in the resource pool controls the total memory available to memory-optimized tables in the database.
For example, create a resource pool, mem_optpool, as follows:
CREATE RESOURCE POOL mem_optpool WITH (MAX_MEMORY_PERCENT = 40);
Now map the database, mydatabase, to this resource pool by executing the following command. With this command, you are specifying that the total memory taken by memory-optimized tables and indexes cannot exceed the limit in the resource pool. So in this case, the remaining 60% of memory is available to other consumers.
EXEC sp_xtp_bind_db_resource_pool 'mydatabase', 'mem_optpool'
When configuring memory for memory-optimized tables, capacity planning should be done based on MIN_MEMORY_PERCENT, not on MAX_MEMORY_PERCENT. This provides more predictable memory availability for memory-optimized tables, as pools with the MIN_MEMORY_PERCENT option set can raise memory-pressure notifications against other pools to ensure the minimum percentage is honored. To ensure that memory is available for the In-Memory OLTP database and to help avoid OOM (Out of Memory) conditions, the values for MAX_MEMORY_PERCENT and MIN_MEMORY_PERCENT should be the same. SQL Server's target memory is dynamic relative to the OS, so setting a minimum memory would be recommended only if the server is not dedicated. For details, please refer to http://msdn.microsoft.com/en-us/library/dn465873(v=sql.120).aspx.
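To make the guidance concrete, here is a minimal sketch that sets MIN_MEMORY_PERCENT equal to MAX_MEMORY_PERCENT and binds a database to the pool (the pool and database names follow the example above; the 40% figure is illustrative):

```sql
-- Equal MIN/MAX gives memory-optimized tables a predictable,
-- reserved share of SQL Server memory.
CREATE RESOURCE POOL mem_optpool
    WITH (MIN_MEMORY_PERCENT = 40, MAX_MEMORY_PERCENT = 40);
GO
ALTER RESOURCE GOVERNOR RECONFIGURE;
GO
EXEC sp_xtp_bind_db_resource_pool 'mydatabase', 'mem_optpool';
GO
```

Note that the binding takes effect the next time the database is brought online.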
The rows for memory-optimized tables are stored in memory and are linked through hash and nonclustered indexes, as described at http://msdn.microsoft.com/en-us/library/dn133190(v=sql.120).aspx. Concurrent access to memory-optimized tables uses optimistic concurrency control based on row versions. Over time, existing rows may get updated (an update operation generates a new row version) and deleted, but these rows can't be removed immediately, as there may be concurrent transactions that still need those row versions. The older row versions are garbage-collected (GC'd) asynchronously once it is determined, based on the active transactions, that they are no longer needed. A GC system thread shares the row-version cleanup (i.e. GC) with user transaction activity to ensure that SQL Server is able to keep up with the GC. When you configure the memory for your workload, you must account for the additional memory needed for stale row versions. You can roughly estimate that memory from the rate at which your workload generates versions (updates and deletes) and how long the longest concurrent transactions keep those versions alive.
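A rough back-of-the-envelope sketch of that estimate (all workload numbers below are hypothetical): stale versions accumulate at roughly the update/delete rate, and each version must be retained for as long as the oldest concurrent transaction might still read it.

```python
def stale_version_memory_bytes(updates_per_sec, deletes_per_sec,
                               avg_row_size_bytes, longest_txn_sec):
    """Rough upper bound on memory held by stale row versions.

    Every update produces a new row version (making the old one stale)
    and every delete leaves a stale version behind; versions cannot be
    garbage-collected while any transaction that might still read them
    is active, approximated here by the longest transaction duration.
    """
    versions_per_sec = updates_per_sec + deletes_per_sec
    return versions_per_sec * longest_txn_sec * avg_row_size_bytes

# Hypothetical workload: 10,000 updates/s + 2,000 deletes/s,
# 200-byte rows, longest concurrent transaction about 5 seconds.
estimate = stale_version_memory_bytes(10_000, 2_000, 200, 5)
print(f"{estimate / (1024**2):.0f} MB")  # prints "11 MB"
```

The real figure also depends on how well garbage collection keeps up, so treat this as a sizing starting point, not a guarantee.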
You can use DMVs and Perfmon counters to monitor the progress of Garbage collection. Please refer to http://msdn.microsoft.com/en-us/library/dn133203(v=sql.120).aspx.
With the release of SQL Server 2014 Community Technology Preview 2 (CTP2) and the commercial release of Visual Studio 2013, we are excited to announce new preview releases of SQL Server Data Tools to support both database and business intelligence projects.
SQL Server Data Tools - Business Intelligence for SQL Server 2014 Preview
To support new business intelligence projects, we have recently released a Community Technology Preview (CTP) of the SQL Server Data Tools – Business Intelligence (SSDT-BI) tools that work with SQL Server 2014 CTP2. This CTP release can be downloaded here.
SQL Server Data Tools - Business Intelligence (SSDT-BI) includes SQL Server Analysis Service Projects, SQL Server Reporting Services Projects, and SQL Server Integration Services Projects hosted in Visual Studio. SSDT-BI tools can be downloaded and installed as free tools from the web. SSDT-BI tools:
Once installed, the SSDT-BI tools do not automatically notify you of available updates within the Visual Studio environment, so you must visit the download site to obtain updates manually. You can also always check here to make sure you have the latest release installed.
SQL Server 2014 Business Intelligence Support and Visual Studio 2013 Support
The existing preview version of SSDT-BI for SQL Server 2014 is based on Visual Studio 2012. The commercial release of SSDT-BI is expected to be based on the newer Visual Studio 2013, in conjunction with the commercial release of SQL Server 2014. At that time, the existing SQL Server 2014 CTP2 SSDT-BI based on Visual Studio 2012 will be retired, and the new version of SSDT-BI based on Visual Studio 2013 will take its place.
SSDT-BI for SQL Server 2014 will support various SQL Server versions as follows:
To target SSIS projects for SQL Server 2012, developers will have to continue to use the existing SSDT-BI version for SQL Server 2012, which is based on Visual Studio 2012. This version of SSDT-BI for SQL Server 2012 that is currently available will continue to be available after the SQL Server 2014 release.
SSDT-BI for Visual Studio 2012 supports versions of SQL Server as follows:
SQL Server Data Tools Database Projects Now Integrated within Visual Studio
The database developer toolset, previously available as an add-in for Visual Studio 2010 and updated for Visual Studio 2012, is now an integrated feature of both Visual Studio 2012 and 2013. Along with being integrated into the full Visual Studio SKUs, the tools can also be installed for free as standalone tools by database developers without Visual Studio. These tools provide end-to-end support for building and managing Windows Azure SQL Databases and SQL Server databases. They provide SQL Server version-specific validation. The tools include live connected tools and designers, and the ability to create database projects that integrate with Visual Studio source control (including Team Foundation Server) for full schema development, debugging, database migrations, and overall database lifecycle management. Over the past year, new features have been added including:
See below for a summarized map to acquire and update the core (non-BI) database tooling.
SQL Server Database Tooling

Visual Studio 2010
- Standalone, add-in (free): SQL Server Data Tools for Visual Studio 2010
- Updates: In-shell notification for standalone, add-in

Visual Studio 2012
- Integrated: Visual Studio Ultimate 2012, Visual Studio Premium 2012, Visual Studio Professional 2012
- Standalone, add-in (free): SQL Server Data Tools for Visual Studio 2012

Visual Studio 2013
- Integrated: Visual Studio Ultimate 2013, Visual Studio Premium 2013, Visual Studio Professional 2013 (http://www.microsoft.com/visualstudio/eng/downloads#d-2013-editions)
- Standalone (free): Visual Studio Express 2013 for Windows, Visual Studio Express 2013 for Web (http://www.microsoft.com/visualstudio/eng/downloads)
- Windows Azure SDK for Visual Studio 2013 (http://www.windowsazure.com/en-us/downloads/)
- Updates: In-shell notification & delivery
SQL Server Data Tools for SQL Server 2014
With the release of SQL Server 2014 CTP2, we have added support for SQL Server 2014 via a release of SQL Server Data Tools for SQL Server 2014 CTP2.
We are hard at work finishing up our complete Visual Studio tooling for SQL Server 2014. Both the tools integrated with Visual Studio 2012 and 2013 and the BI tools already have pre-releases available for SQL Server 2014. When SQL Server 2014 is commercially released, we plan to commercially release the corresponding updates to the database developer and BI tools as well.
SQL Server 2014 CTP1 introduced hash indexes for memory-optimized tables. Hash indexes are very efficient for point lookups, when you know exactly the value you are looking for. However, they do not perform well if you need a range of values, for example a date range, or if you need to retrieve the rows in a particular order.
Memory-optimized nonclustered indexes are a new index type for memory-optimized tables in CTP2. Nonclustered indexes support retrieving ranges of values, and also support retrieving the table rows in the order that was specified when the index was created. They can be used to optimize the performance of queries on memory-optimized tables that use inequality predicates like ‘<’ and ‘>’, and queries using an ORDER BY clause.
Nonclustered indexes also support point lookups, but hash indexes still offer far better performance. You should continue to use hash indexes to optimize the performance of queries with equality predicates.
Consider, for example, the following (simplified) CREATE TABLE statement. The table SalesOrderHeader_inmem has a hash index on the primary key column SalesOrderID, as well as a hash index on the column OrderDate. Note that the BUCKET_COUNT should typically be set between one and two times the number of unique index key values.
CREATE TABLE [Sales].[SalesOrderHeader_inmem](
    [SalesOrderID] uniqueidentifier NOT NULL
        PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT=1000000)
        DEFAULT (NEWID()),
    [RevisionNumber] [tinyint] NOT NULL
        CONSTRAINT [IMDF_SalesOrderHeader_RevisionNumber] DEFAULT ((0)),
    [OrderDate] [datetime2] NOT NULL
        INDEX ix_OrderDate HASH WITH (BUCKET_COUNT=1000000)
) WITH (MEMORY_OPTIMIZED=ON)
If you run a query of the form:
SELECT * FROM Sales.SalesOrderHeader_inmem WHERE SalesOrderID = @ID
SQL Server will use the hash index on the column SalesOrderID to quickly identify the row corresponding to the parameter @ID. Witness the Index Seek operation in the query plan:
Hash indexes are very efficient for point lookup operations, because they require only a simple lookup in a hash table rather than traversing an index tree structure, as is required for traditional (non)clustered indexes. If you are looking for the rows corresponding to a particular index key value, using a hash index is the way to go.
Now, hash indexes do have some limitations. Because of the nature of hash tables, rows appear in the index in random order. This means it is not possible to retrieve ranges of values, or to scan rows in any particular order. Therefore, hash indexes do not support Index Seek operations on inequality predicates, and do not support ordered scans. The former results in a full index scan, and the latter results in a scan followed by an (expensive) sort operator. All this results in a potential performance degradation when using such queries with hash indexes.
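The contrast can be illustrated with a toy Python sketch (this is just the data-structure intuition, not SQL Server internals): a hash table answers point lookups in one step but keeps no key order, while a sorted structure supports a seek to the start of a range followed by an ordered scan.

```python
import bisect

# Toy OrderDate keys (as integers) indexed two ways.
dates = [20140105, 20140312, 20140101, 20140220, 20140418]

# "Hash index": O(1) point lookup, but buckets have no key order,
# so a range predicate degenerates into a scan over every entry.
hash_index = {d: f"row-{d}" for d in dates}
point = hash_index[20140220]                              # point lookup
range_via_hash = [d for d in hash_index if d > 20140201]  # full scan

# "Ordered index": keys kept sorted, so a range seek binary-searches
# to the first qualifying key and then reads forward in order.
ordered_index = sorted(dates)
start = bisect.bisect_right(ordered_index, 20140201)
range_via_ordered = ordered_index[start:]       # seek + ordered scan

print(range_via_ordered)  # prints [20140220, 20140312, 20140418]
```

Both approaches return the same rows, but the hash path had to touch every entry and still delivers them unordered.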
Consider the following two example queries:
Query with an inequality predicate
SELECT * FROM Sales.SalesOrderHeader_inmem WHERE OrderDate > @Date
The plan for this query does not use the index on OrderDate; it simply includes a full index scan of the primary index. This means that SQL Server will process all the rows in the table, and only afterwards filter down to the ones with OrderDate > @Date.
Query with an ORDER BY
SELECT * FROM Sales.SalesOrderHeader_inmem ORDER BY OrderDate
The plan for this query includes a sort operator, which is very costly: after scanning the rows, all rows will need to be ordered to obtain the desired sort-order.
The new memory-optimized nonclustered indexes support both Index Seek operations using inequality predicates, and ordered scans. Consider the following amended CREATE TABLE statement, which now includes a nonclustered index on the column OrderDate.
CREATE TABLE [Sales].[SalesOrderHeader_inmem](
    [SalesOrderID] uniqueidentifier NOT NULL
        PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT=1000000)
        DEFAULT (NEWID()),
    [RevisionNumber] [tinyint] NOT NULL
        CONSTRAINT [IMDF_SalesOrderHeader_RevisionNumber] DEFAULT ((0)),
    [OrderDate] [datetime2] NOT NULL
        INDEX ix_OrderDate NONCLUSTERED
) WITH (MEMORY_OPTIMIZED=ON)
Note that the keyword NONCLUSTERED is optional in this example. Also note that the syntax for memory-optimized nonclustered indexes is similar to that of traditional disk-based nonclustered indexes. The only difference is that memory-optimized indexes must be specified in the CREATE TABLE statement, while traditional indexes can be created after the table is created.
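A hypothetical side-by-side sketch of that syntax difference (table and index names are invented for illustration):

```sql
-- Disk-based table: the index can be created after the table exists.
CREATE TABLE Sales.SalesOrderHeader_disk (
    SalesOrderID int NOT NULL PRIMARY KEY,
    OrderDate datetime2 NOT NULL
);
CREATE NONCLUSTERED INDEX ix_OrderDate_disk
    ON Sales.SalesOrderHeader_disk (OrderDate);

-- Memory-optimized table: every index must be declared inline in the
-- CREATE TABLE statement; a separate CREATE INDEX is not supported.
CREATE TABLE Sales.SalesOrderHeader_mem (
    SalesOrderID int NOT NULL
        PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 1000000),
    OrderDate datetime2 NOT NULL INDEX ix_OrderDate_mem NONCLUSTERED
) WITH (MEMORY_OPTIMIZED = ON);
```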
Consider now the same two queries as above:
The plan for this query now uses the index on OrderDate. And when inspecting the properties of the Index Seek operator, you will see the inequality predicate OrderDate > @Date. This means that SQL Server will only need to process the rows with OrderDate > @Date.
The plan for this query does not include a sort operator. SQL Server will scan the rows in the index in order; in this case the sort-order of the index is the same as the sort-order required by the query, thus no additional sort operator is needed.
Note that, in contrast to disk-based nonclustered indexes, memory-optimized nonclustered indexes are uni-directional. This means they support index scans only in the order that was specified when the index was created. If the ORDER BY clause in the above example required OrderDate in DESC order, the index ix_OrderDate could not be used.
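If the descending order is what your queries need, my understanding is that you can declare the index key as descending when the table is created; a hypothetical sketch (table and index names invented):

```sql
-- Declaring the index key as descending at table creation, so the
-- index can serve ORDER BY OrderDate DESC. A query ordering ascending
-- would then need a separate, ascending index.
CREATE TABLE Sales.SalesOrderHeader_desc (
    SalesOrderID int NOT NULL
        PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 1000000),
    OrderDate datetime2 NOT NULL,
    INDEX ix_OrderDate_desc NONCLUSTERED (OrderDate DESC)
) WITH (MEMORY_OPTIMIZED = ON);
```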
Memory-optimized indexes do have a few limitations in SQL Server 2014 that we hope to address in future versions. You do need to consider these limitations when deciding on an indexing strategy for your memory-optimized tables.
A few simple guidelines to help choose the type of index you need for your application:
Hash indexes require an estimate of the number of unique index key values to determine an appropriate value for the BUCKET_COUNT, typically between 1X and 2X that number. Because nonclustered indexes do not require setting the BUCKET_COUNT, it can be tempting to simply use a nonclustered index instead of a hash index. However, this could lead to sub-optimal performance for point lookup operations. In many scenarios where point lookup performance is critical and there is no good estimate for the number of unique index key values, it is better to over-provision the BUCKET_COUNT: i.e., pick a very high bucket count that you know is going to be larger than the number of index keys. Note that over-provisioning does not affect the performance of point lookups, but under-provisioning does. However, over-provisioning does increase memory consumption, and it slows down full index scans.
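As a sketch of that sizing rule (the power-of-two rounding reflects my understanding that SQL Server rounds the supplied BUCKET_COUNT up to the next power of two internally; the helper and its parameters are hypothetical):

```python
def suggested_bucket_count(distinct_keys_estimate, headroom=2.0):
    """Pick a BUCKET_COUNT of roughly 1x-2x the expected number of
    distinct index keys, rounded up to a power of two."""
    target = max(1, int(distinct_keys_estimate * headroom))
    bucket_count = 1
    while bucket_count < target:
        bucket_count *= 2  # round up to the next power of two
    return bucket_count

# e.g. 1,000,000 distinct keys with 2x headroom:
print(suggested_bucket_count(1_000_000))  # prints 2097152
```

When the distinct-key count is truly unknown, err on the high side: the memory cost of extra buckets is modest next to the cost of long hash chains from under-provisioning.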
For more details on usage of memory-optimized indexes see the Books Online article on Guidelines for Using Indexes on Memory-Optimized Tables.
To help you get started with the new In-Memory OLTP feature, we created a sample around sales order processing based on the AdventureWorks sample database.
The sample, along with installation and usage instructions, is available on Codeplex:
https://msftdbprodsamples.codeplex.com/releases/view/114491
We encourage you to download and install the sample to become familiar with the new memory-optimized tables and natively compiled stored procedures, introduced by the In-Memory OLTP feature in SQL Server 2014 CTP2.
We also included a number of demo stored procedures in the sample, which can be used to measure the performance of the sample on your system, and to contrast the performance of the new memory-optimized tables with traditional disk-based tables. Instructions for running the demo stored procedures, and for testing at scale, are in the documentation included with the sample.
The sample is a work in progress, and we appreciate any and all feedback. You can post feedback and questions on the SQL Server Samples Forum.
We will be updating the sample regularly based on your feedback, as we get closer to the RTM release.
Resource Governor was introduced in SQL Server 2008 to achieve predictable performance in a multi-tenant environment, to support mixed workload patterns, to provide mechanisms and tools to deal with runaway queries, and to enforce resource limits for CPU and memory. It enables customers to implement database consolidation or to configure their own database as a service. Since then, we’ve been incrementally enhancing this feature in major releases to address the top customer requests in this area. In SQL Server 2014, we are excited to add support for IO resource governance.
What is new?
How to use it?
Let’s take the following scenario as an example of how IO Resource Governance can be used to control the resource usage in a SQL Server instance:
Let’s suppose we are a database hoster, or are running a private cloud for database consolidation, and we need to host multiple databases from different tenants (or customers) within a single SQL Server instance to achieve better density and COGS. If one of the tenants runs a very IO-intensive workload, it can saturate the IO subsystem, causing performance problems for concurrent workloads that need to perform IO.
The first step would be to create a Resource Pool for each tenant/database and a classifier function that will map the sessions from each tenant to the corresponding Resource Pool. For example, sessions for Customer 1 can be mapped to Resource Pool 1 and sessions for Customer 2 to Resource Pool 2.
If you want to use IO Resource Governance, it is important to set the min or max IOPS setting for every Resource Pool so that the IO requests are redirected to the governance subsystem and minimum reservations can be honored. In the example below, we set the MAX_IOPS_PER_VOLUME to the maximum value for every Pool:
-- Create 2 resource pools & 2 workload groups.
CREATE RESOURCE POOL Customer1Pool;
CREATE RESOURCE POOL Customer2Pool;
GO
CREATE WORKLOAD GROUP Customer1Group USING Customer1Pool;
CREATE WORKLOAD GROUP Customer2Group USING Customer2Pool;
GO
-- Create classifier function
CREATE FUNCTION dbo.fnUserClassifier()
RETURNS SYSNAME
WITH SCHEMABINDING
AS
BEGIN
    IF ORIGINAL_DB_NAME() = 'Customer1DB'
    BEGIN
        RETURN 'Customer1Group'
    END
    IF ORIGINAL_DB_NAME() = 'Customer2DB'
    BEGIN
        RETURN 'Customer2Group'
    END
    RETURN 'default'
END;
GO
-- Set the classifier function and enable RG.
ALTER RESOURCE GOVERNOR WITH (CLASSIFIER_FUNCTION = dbo.fnUserClassifier);
ALTER RESOURCE GOVERNOR RECONFIGURE;
GO
-- Set default values for the resource pools so that IO RG is enabled.
ALTER RESOURCE POOL Customer1Pool WITH (MIN_IOPS_PER_VOLUME=0, MAX_IOPS_PER_VOLUME=2147483647);
ALTER RESOURCE POOL Customer2Pool WITH (MIN_IOPS_PER_VOLUME=0, MAX_IOPS_PER_VOLUME=2147483647);
ALTER RESOURCE GOVERNOR RECONFIGURE;
GO
Classifying each workload to a different Resource Pool allows us to configure the resource limits we want to provide for each tenant and also monitor the resource usage generated by their workload. The graph below (Performance Monitor) shows that the workload from Customer 1 is issuing too many IO requests causing a performance drop for Customer 2:
In order to protect Customer 2 and guarantee consistent performance regardless of the activity of other tenants, we can set the MIN_IOPS_PER_VOLUME setting for the corresponding Resource Pool. From the graph above, it seems that the system can handle around 1300 IOPS, so we decide to reserve 650 IOPS for Customer 2:
ALTER RESOURCE POOL Customer2Pool WITH (MIN_IOPS_PER_VOLUME=650);
ALTER RESOURCE GOVERNOR RECONFIGURE;
GO
With this configuration, SQL Server will try to throttle the workloads that are running in other Resource Pools, in order to satisfy the 650 IOPS reservation for Customer 2. In the graph below, we can see that the IOPS of the system are now fairly distributed among the tenants and that the performance for Customer 2 is back to normal despite its noisy neighbor:
The MIN_IOPS_PER_VOLUME setting makes a reservation for the Resource Pool, but it does not set any limit on the maximum IOPS the pool can use. This means that tenants will still see performance variation depending on how active the rest of the tenants on the instance are. To avoid this problem and guarantee predictable performance, we can set the MAX_IOPS_PER_VOLUME setting for each of the tenants. This sets a hard limit on the maximum IOPS the tenant's workload can use, guaranteeing predictable performance for the tenant, but also protecting the rest of the tenants on the instance:
ALTER RESOURCE POOL Customer2Pool WITH (MAX_IOPS_PER_VOLUME=750);
ALTER RESOURCE GOVERNOR RECONFIGURE;
GO
By configuring the IO settings on the Resource Pools we can control the resources we want to provide for each tenant. This allows us to guarantee predictable performance regardless of the activity from other tenants or even provide differentiation in SLA for the Database Service based on the amount of the resources customers sign up to reserve.
Another scenario that many of you might find applicable is isolating your OLTP workload from any maintenance operations that need to run in the database. Rebuilding an index, for example, is a common operation that can trigger a large number of IO requests, as it needs to scan the whole index or table. By using IO Resource Governance we can limit the number of IO operations these tasks can perform and guarantee predictable performance for concurrent OLTP workload.
In this case, we need a Resource Pool dedicated to maintenance operations and a classifier function that maps maintenance sessions to that Resource Pool. Running these operations as a different user can be a good way to distinguish between regular and maintenance sessions. By configuring the MAX_IOPS_PER_VOLUME setting on the “maintenance” Resource Pool, we can limit the number of IO operations these tasks can perform and protect the concurrent OLTP workload from being impacted.
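A hypothetical sketch of such a setup (the login name maintenance_login and the 200 IOPS cap are invented for illustration):

```sql
-- Cap IO for sessions of a dedicated maintenance login.
CREATE RESOURCE POOL MaintenancePool
    WITH (MIN_IOPS_PER_VOLUME = 0, MAX_IOPS_PER_VOLUME = 200);
GO
CREATE WORKLOAD GROUP MaintenanceGroup USING MaintenancePool;
GO
CREATE FUNCTION dbo.fnMaintenanceClassifier()
RETURNS SYSNAME
WITH SCHEMABINDING
AS
BEGIN
    IF SUSER_SNAME() = 'maintenance_login'
    BEGIN
        RETURN 'MaintenanceGroup'
    END
    RETURN 'default'
END;
GO
ALTER RESOURCE GOVERNOR WITH (CLASSIFIER_FUNCTION = dbo.fnMaintenanceClassifier);
ALTER RESOURCE GOVERNOR RECONFIGURE;
GO
```

With this in place, an index rebuild run under maintenance_login is throttled to the configured IOPS cap, while regular OLTP sessions fall into the default group.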
I hope the example scenarios above demonstrate the core value of this feature.
Call to Action
Please try this feature right away by downloading SQL Server 2014 CTP2. You can test it even more easily on the SQL Server 2014 CTP2 images that are now available in Windows Azure. We look forward to hearing your feedback.
If your business relies on data, you know that it is a constant challenge to store, manage, and analyze it effectively as your data continues to grow. It’s also expensive to keep enough data on “hot” storage where it is readily available for analysis. Even when you have the data you need on hot storage, it can take hours or even days to run analysis and reports on today’s symmetric multi-processing (SMP) systems. To add more to the challenges, businesses today are struggling to figure out how to add the value of non-relational Hadoop data into their analysis.
As a result, business analysts are held back from making faster and more accurate data-driven business decisions that are needed to compete in today’s marketplace. This is the modern data challenge and one that some of our SQL Server customers are facing today. But help is here for those of you who know and love SQL Server (or even for those of you who don't).
Get to know the SQL Server 2012 Parallel Data Warehouse (PDW), Microsoft’s next generation platform for data warehousing and Big Data integration. With PDW’s massively parallel processing (MPP) design, queries commonly complete 50 times faster than traditional data warehouses built on symmetric multi-processing (SMP) database management systems. 50 times faster means that queries complete in minutes instead of hours, or seconds instead of minutes. With this breakthrough performance, your business analysts can generate more extensive results faster, and can easily perform ad-hoc queries or drill down into the details. As a result, your business can make better decisions, faster.
To learn more, go to http://www.microsoft.com/en-us/sqlserver/solutions-technologies/data-warehousing/upgrade-to-pdw.aspx.
Starting with SQL Server 2012, the AlwaysOn Availability Groups and AlwaysOn Failover Cluster Instances (FCI) suite of features has enabled Tier-1 customers with mission-critical applications to achieve their availability goals with SQL Server through an easy-to-deploy and easy-to-manage solution. SQL Server 2014 builds on this success and offers enhanced AlwaysOn Availability Groups with up to 8 replicas, the ability to offload reporting workloads to a secondary replica even in disconnected scenarios, and hybrid scenarios with Windows Azure.
In-Memory OLTP also targets mission-critical applications, delivering up to 30x better performance, and is well integrated with the High Availability offerings SQL Server provides. The sections below review each of the High Availability offerings in the context of the In-Memory OLTP solution. SQL Server 2014 offers four high-availability choices to customers, as follows:
In-Memory OLTP is fully integrated with FCI. A database with memory-optimized tables can be configured and managed similar to how you would manage a database with no memory-optimized objects. However, memory-optimized tables in the database will likely add to the recovery time, because these tables need to be loaded into memory before the database can be accessed. The In-Memory OLTP engine loads data/delta files in parallel to minimize the time taken. The time to load the memory-optimized tables depends on the size of the memory-optimized tables, the number of data/delta files, the IOPS available to read the data/delta files, and the number of CPUs/cores available. It is a best practice to configure the SQL Server instances with the same amount of memory because, unlike with disk-based tables, insufficient memory will cause database recovery to fail on the new instance. Please refer to How to manage your Memory for In-Memory OLTP for managing memory for a database with memory-optimized tables.
In-Memory OLTP is fully integrated with AlwaysOn Availability Groups. You can set up a database with memory-optimized tables in an availability group following exactly the same steps that you would for a database without memory-optimized tables. Key points:
In-Memory OLTP: Log shipping is fully supported on databases with memory-optimized tables. A database with memory-optimized tables can be configured and managed similar to how you would manage a regular database.
In-Memory OLTP: Replication is supported on databases with memory-optimized tables, but with the limitation that you cannot use a memory-optimized table as an article in a publication or as part of a subscription. We will look into addressing this in a future release. In spite of this restriction, you can use memory-optimized tables in read-scale scenarios for read-mostly workloads, as described below.
Scenario: Let us say you have a couple of disk-based tables, t_disk1 and t_disk2, that are accessed heavily by read workloads. For your workload, you want to replicate these two tables to the subscriber, but you want the read workload to run against memory-optimized tables. You can achieve this by creating corresponding memory-optimized tables, t_memopt1 and t_memopt2, and populating them using DML triggers on the t_disk1 and t_disk2 tables. On any insert, update, or delete of a row, the triggered action updates t_memopt1 and t_memopt2 accordingly. Essentially, you have two copies of the same table: one is disk-based and the other is memory-optimized. Now you can direct your read workload to the memory-optimized tables.
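A minimal sketch of one such synchronizing trigger (the column names id and col1 are hypothetical; a real implementation would mirror the full schema and include an analogous trigger for t_disk2/t_memopt2):

```sql
CREATE TRIGGER dbo.tr_t_disk1_sync
ON dbo.t_disk1
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
    SET NOCOUNT ON;
    -- Remove the old versions of deleted and updated rows.
    DELETE m
    FROM dbo.t_memopt1 AS m
    INNER JOIN deleted AS d ON m.id = d.id;
    -- Insert the current versions of inserted and updated rows.
    INSERT INTO dbo.t_memopt1 (id, col1)
    SELECT id, col1 FROM inserted;
END;
GO
```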
This is one of the best times to be working in SQL Server. I just came from the O’Reilly Strata + Hadoop World conference this week, where there was a great dialog about advances in data management techniques. We’re already the world’s most deployed database, we just released the second technology preview for SQL Server 2014, and if you saw Microsoft’s recent financial results you know SQL Server continues to outpace the market: revenue grew double-digits, with SQL Server Premium revenue growing more than 30%. And the SQL Server momentum is not stopping.
On October 21st, Gartner published the Magic Quadrant for Operational Database Management Systems. In the opening summary, Gartner states, “The operational DBMS market (formerly OLTP) is evolving dramatically, with new, innovative entrants and incumbents supporting the growing use of unstructured data and NoSQL DBMS engines. Information management leaders must understand the market implications affecting DBMS technology decisions.”
Microsoft is truly honored to be included in the Leaders Quadrant* for this Magic Quadrant based on our completeness of vision and ability to execute.
I couldn’t agree more with Gartner, and our customers agree as well. The early previews for SQL Server 2014 have been downloaded more than 50,000 times and are being previewed by more than 100 customers; you can download the preview here. Check out their case studies to read more about how customers such as SBI Liquidity and TPP are taking advantage of the in-memory capabilities in SQL Server 2014. Ferranti is a great example of using both In-Memory OLTP and HDInsight technologies to process a continuous data flow of up to 200 million measurement channels, making the system fully capable of meeting the demands of smart meter technology. Go here to read the full Gartner report.
Eron Kelly General Manager SQL Server Marketing
* Gartner, Magic Quadrant for Operational Database Management Systems by Donald Feinberg, Merv Adrian and Nick Heudecker, October 21, 2013.
This graphic was published by Gartner, Inc. as part of a larger research document and should be evaluated in the context of the entire document. The Gartner document is available at http://www.gartner.com/reprints/server-tools?id=1-1M9YEHW&ct=131028&st=sb.
Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings. Gartner research publications consist of the opinions of Gartner's research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.
If you’ve ever played a video game where you manage a virtual city, you understand how real-time insight into services including emergency response teams and transportation is needed for effective city planning. Until recently, the technologies just weren’t available to support this kind of detailed “big picture” view enjoyed by gamers. However, now that Big Data business intelligence (BI) solutions are a reality, Barcelona, Spain is working to achieve transformative insight so that it can better meet the needs of its citizens.
To test the feasibility of a Big Data BI solution, Barcelona created a pilot that runs on a hybrid cloud based on Windows Azure, Windows Azure HDInsight Service, and Microsoft SQL Server 2012. With it, users can view real-time BI that combines petabytes of existing data in the city’s systems with new, unstructured public data sources such as Twitter, app log files, and GPS signals from cell phone providers. The BI is served up via mobile apps, dashboards, custom reports, and data services.
The variety of technology choices in the Microsoft platform simplified development. For example, engineers met the diverse needs of users and devices using HDInsight and built-in tools in SQL Server, Microsoft Office, and Visual Studio. Engineers were also able to manage the project’s varied data and cost requirements with Apache Hadoop and Azure’s other storage options, which you can read about here.
Everyone wins with this kind of solution. City employees are more productive because they quickly see how services are working from “big picture” views that include detailed numbers and social media feedback. Citizens enjoy a better quality of life because services like emergency teams, transportation, schools, and festivals are customized to meet real needs. Businesses can also increase success by quickly recognizing investment opportunities in specific neighborhoods. Other cities also benefit because they can adopt this repeatable smart-city infrastructure to transform insight, lifestyles, and economic success.
You can learn more about Barcelona’s pilot solution by reading the case study and watching the video here.
Collecting Big Data is significant only if you can make use of it. With Windows Azure HDInsight Service, organizations can use Apache Hadoop clusters in Windows Azure. Combining the Microsoft platform with open-source tools like Hadoop equates to unprecedented insight into more data, from more sources, in any format—structured and unstructured. Here are a few examples:
Chr. Hansen develops food, pharmaceutical, and agricultural ingredients for global clients. By creating a hybrid cloud solution in less than one week with Windows Azure, HDInsight, and existing SQL Server 2012 databases, the company increased the number of trials it could analyze by 100 times. Read the full story here.
Ascribe is a leading provider of clinical IT solutions in the United Kingdom. To recognize outbreaks of health threats faster, Ascribe created a hybrid cloud solution in just six weeks using Windows Azure, HDInsight, and SQL Server 2012. Now in just seconds, automated processes can analyze millions of records from disparate sources including social media feeds—and deliver potentially life-saving alerts to mobile devices and desktops. Read more here.
Virginia Polytechnic Institute and State University (Virginia Tech) is a leading US research institution that recently implemented an on-demand, cloud computing model using HDInsight. With it, the university is saving money, and researchers working on new cancer therapies and antibiotics are accomplishing more while enjoying easier access to DNA-sequencing tools and resources. The details are here.
There are lots of Big Data and cloud-service options but what sets Windows Azure HDInsight Service apart is that it’s part of the larger end-to-end Microsoft platform, so you get more than just a service. You get built-in tools for creating highly effective, available, scalable, and affordable enterprise-ready solutions that include BI (supported by SQL Server and Microsoft Office) and mobile apps (created with Visual Studio). On top of that, Windows Azure solutions can run entirely in the cloud—or work with existing systems. You get to choose.
To learn more about what’s possible with the Microsoft platform, read Microsoft’s Cloud OS Vision.
Today, Microsoft announced the general availability of Windows Azure HDInsight, its cloud-based distribution of Hadoop. Further expanding on Microsoft’s big data strategy, Microsoft CVP Quentin Clark will deliver a keynote at O’Reilly Strata + Hadoop World tomorrow, where he will discuss Microsoft’s vision to democratize big data by providing easy to use tools and a unified data platform. Clark further detailed this approach in a post on the Official Microsoft Blog.
News Highlights from Strata + Hadoop World Conference
For more information on today’s news, check out Clark’s post on the Official Microsoft Blog. To follow the news from O’Reilly Strata + Hadoop World, follow the conversation on Twitter at @SQLServer and #strataconf.
PASS Summit 2013 has been over for a little more than a week and we are already looking forward to what's next in 2014. PASS Summit is the world's largest and most intensive conference for Microsoft SQL Server and BI professionals, and it just keeps growing! We are pleased to have had over 5,000 registrations for this conference, which featured over 190 technical sessions, 6 deep-dive Half-Day Sessions and 14 Pre-Conference Sessions. If you were not able to join us in Charlotte or missed the live stream, check out the PASSTV site to view the PASS Summit 2013 keynotes as well as gain insight into many of the sessions and learning provided at the conference.
To set the stage for this year's 3-day conference, we heard first-hand from Microsoft Corporate Vice President Quentin Clark and Dr. David DeWitt, Microsoft Technical Fellow at the Gray Systems Lab.
Quentin Clark kicked off Day One with the keynote, "Microsoft’s Data Platform – The Road Ahead". With an audience of over 5,000 viewing in person and via live stream, he discussed how customers are pushing the boundaries of what’s possible for businesses today using the advanced technologies in our data platform. Also during his discussion, he announced the second Community Technology Preview (CTP2) of SQL Server 2014 which features breakthrough performance with In-Memory OLTP and simplified backup and disaster recovery in Windows Azure.
David DeWitt kicked off Day Two with a keynote on "Hekaton: Why, What, and How". He shared deep insights on In-Memory OLTP's ability to provide performance improvements for transactional workloads.
Outside of the great keynotes, participants attended a wide variety of technical sessions presented by our experts and industry leaders. The conference also featured Microsoft on-site certification testing, instructor-led workshops, technical guidance from Microsoft Customer Service and Support (CSS) engineers, and hands-on support from Azure Customer Advisory Teams (CAT).
So there you have it -- it's a wrap on PASS Summit 2013, but keep in mind that we are already planning the PASS Business Analytics Conference, taking place May 7-9 in San Jose, California, and PASS Summit 2014, which will take place November 4-7 in Seattle, Washington. We hope to see you there!
SQL Server 2014 CTP2 includes three SQL Server Backup and Restore enhancements. These enhancements enable backup to Windows Azure, provide options that ease and automate backup management, and improve backup security through encryption.
In addition, to enable backup to Windows Azure from previous versions of SQL Server, a stand-alone tool is now available. A brief description of the tool and the link to the download center is included at the end of this article.
The three enhancements are described in detail below:
Support for SQL Server backup to and restore from Windows Azure storage was introduced in SQL Server 2012 SP1 CU2 with Transact-SQL, SMO and PowerShell support. In the current release, the feature is further enhanced to include UI support. Backup and Restore tasks and maintenance plans now include URL as a destination option when backing up or restoring a database.
Following are the UI changes in SSMS:
If you click the Create... button, you will see a new dialog titled Backup to URL – Create Credential.
When you select URL as the destination, only certain options are enabled on the Media Options page. If you want to overwrite a backup stored in Windows Azure storage, you can do so in Transact-SQL using the WITH FORMAT option.
Maintenance Plan Wizard has similar changes to the Define a Backup Page to support backup to Windows Azure storage.
In the Restore task, you now have the option of selecting URL as the device to restore from. If you select URL, you are asked to specify the SQL Credential to authenticate to the Windows Azure storage.
When you select URL as the media type and click Add, you see a new dialog, Connect to Windows Azure Storage, which allows you to specify the SQL Credential to authenticate to the storage account.
Here is the link to the topic that describes the feature in detail: SQL Server Backup to URL
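The same backup-to-URL flow can be driven entirely from Transact-SQL. Here is a minimal sketch; the storage account name, container name, credential name, and secret below are placeholders, not values from this article:

```sql
-- Create a SQL credential holding the Windows Azure storage account
-- name (as the identity) and its access key (as the secret).
CREATE CREDENTIAL MyAzureCredential
WITH IDENTITY = 'mystorageaccount',
     SECRET = '<storage account access key>';
GO

-- Back up directly to a blob in Windows Azure storage. WITH FORMAT
-- overwrites an existing blob of the same name, as noted above.
BACKUP DATABASE MyTestDB
TO URL = 'https://mystorageaccount.blob.core.windows.net/backups/MyTestDB.bak'
WITH CREDENTIAL = 'MyAzureCredential', FORMAT, STATS = 10;
GO
```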
Built on top of the SQL Server Backup to URL technology, SQL Server 2014 CTP2 includes the option to let SQL Server create and manage the backup strategy. SQL Server creates a backup strategy intelligently, one that is based on recoverability to a point in time within the specified retention period and transaction activity on the database. The destination for the backup file is Windows Azure Storage. Once you configure this feature either for a database or for an entire instance, SQL Server manages the backup frequency for full and log backups. It supports point in time restore based on the retention time period you specify when configuring SQL Server Managed Backup. It also supports the use of encryption. Encryption is also a new feature released in SQL Server 2014 CTP2 and is described later in this blog post.
Following are sample Transact-SQL statements for configuring SQL Server Managed Backup to Windows Azure.
For a database:
USE msdb;
GO
EXEC smart_admin.sp_set_db_backup
    @database_name = 'TestDB'
    ,@enable_backup = 1
    ,@retention_days = 30
    ,@credential_name = 'MyCredential'
    ,@encryption_algorithm = 'AES_256'
    ,@encryptor_type = 'Certificate'
    ,@encryptor_name = 'MyServerCert01';
GO
For an instance of SQL Server:
USE msdb;
GO
EXEC smart_admin.sp_set_instance_backup
    @enable_backup = 1
    ,@retention_days = 30
    ,@credential_name = 'sqlbackuptoURL'
    ,@encryption_algorithm = 'AES_128'
    ,@encryptor_type = 'Certificate'
    ,@encryptor_name = 'MyBackupCert';
GO
Important: To create a backup certificate to use when configuring SQL Server Managed Backup, follow the steps described in the Encrypted Backup section below.
Here is the link to the topic that describes the feature in detail: SQL Server Managed Backup to Windows Azure
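Once configured, the settings can be inspected with the smart_admin helper functions in msdb. A sketch, assuming the TestDB configuration above (function names are taken from the CTP2 msdb schema and may change in later releases):

```sql
USE msdb;
GO
-- Review the managed-backup settings for a single database.
SELECT * FROM smart_admin.fn_backup_db_config('TestDB');

-- Review the instance-level defaults applied to new databases.
SELECT * FROM smart_admin.fn_backup_instance_config();
GO
```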
To secure your backup data, you can now choose to encrypt the backup when you create it. The encryption options include an encryption algorithm and a certificate or asymmetric key to be used for the encryption. Only asymmetric keys residing in an Extended Key Management (EKM) provider are supported.
Multiple encryption algorithms up to AES 256-bit are supported, giving you a variety of algorithms to choose from. Adding backup encryption to a TDE-encrypted database gives additional protection for your data.
You can control the process using Transact-SQL, SMO, PowerShell, or SSMS. Encryption options can also be used when configuring SQL Server Managed Backup to Windows Azure, therefore providing additional security for your off-site data.
It is very important to back up the certificate or key used for encryption to a location different from the backups it is used to encrypt. Without the certificate or key, the backup file cannot be restored.
Important: Restore validates the thumbprint of the certificate during the restore operation, so the certificate used to create the backup must be retained in its original state. Renewing the expiry date, for example, changes the thumbprint and can render the certificate unusable.
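Backing up the certificate together with its private key is standard Transact-SQL. A sketch with placeholder file paths and password:

```sql
USE master;
GO
-- Export the backup-encryption certificate and its private key to
-- files stored somewhere other than the encrypted backups themselves.
BACKUP CERTIFICATE MyTestDBBackupEncryptCert
TO FILE = 'D:\Safe\MyTestDBBackupEncryptCert.cer'
WITH PRIVATE KEY (
    FILE = 'D:\Safe\MyTestDBBackupEncryptCert.pvk',
    ENCRYPTION BY PASSWORD = '<strong password>'
);
GO
```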
Below are some simple steps you can follow to test out the feature.
1. Create a Database Master Key for the master database on the instance
USE master;
GO
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '23987hxJ#KL95234nl0zBef';
GO
2. Create an encryption certificate
USE master;
GO
CREATE CERTIFICATE MyTestDBBackupEncryptCert
    WITH SUBJECT = 'MyTestDB Backup Encryption Certificate';
GO
3. Back up a database with the encryption option, specifying an encryption algorithm and the certificate to use.
BACKUP DATABASE [MyTestDB]
TO DISK = N'C:\Program Files\Microsoft SQL Server\MSSQL12.MSSQLSERVER\MSSQL\Backup\MyTestDB.bak'
WITH
    ENCRYPTION (ALGORITHM = AES_256, SERVER CERTIFICATE = MyTestDBBackupEncryptCert),
    STATS = 10;
GO
The Backup Options page of the Back Up Database Task dialog has the new encryption options.
Here is the topic that describes the feature in detail: Backup Encryption
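Restoring an encrypted backup requires no extra syntax: as long as the certificate is available on the restoring instance, decryption is transparent. A sketch using the backup taken in step 3:

```sql
-- The certificate MyTestDBBackupEncryptCert must already exist in the
-- master database of the restoring instance (recreated from its file
-- backup if this is a different server).
RESTORE DATABASE [MyTestDB]
FROM DISK = N'C:\Program Files\Microsoft SQL Server\MSSQL12.MSSQLSERVER\MSSQL\Backup\MyTestDB.bak'
WITH RECOVERY, STATS = 10;
GO
```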
This stand-alone tool can be configured to redirect SQL Server backups to Windows Azure storage. Designed for SQL Server versions that do not have the built-in ability to back up to Windows Azure, it also includes options to encrypt and compress backups.
For more information, see the download page.
Microsoft recently announced several new ways to back up and recover SQL Server databases with Windows Azure. These features, now available in SQL Server 2014 CTP2 and as a standalone tool for prior versions, provide an easy path to cloud backup and disaster recovery for on-premises SQL Server databases. The capabilities for backing up to Windows Azure Storage help to reduce storage costs and unlock the data protection and disaster recovery benefits of cloud data storage.
Benefits of the new SQL Server Backup to Windows Azure include:
Simplified Cloud Backup
There are several new capabilities enabling users to utilize Windows Azure for SQL Server backup and disaster recovery.
Learn More
SQL Server 2014's goal is to deliver mission-critical performance along with faster insights into any data, big or small. At the same time, it will enable new hybrid cloud solutions that can provide greater data protection and positively impact your bottom line. Early adopters are already leveraging new hybrid scenarios to extend their backup and disaster recovery capabilities around the globe without the need for additional storage replication technologies.
If you would like to try the Backup and Recovery Enhancements in SQL Server 2014 CTP2, Thursday’s blog post will help you get started configuring and using these new capabilities. You can also preview the SQL Server Backup to Windows Azure Tool, enabling backup to Windows Azure for SQL Server 2005 and forward.
You can learn more about Hybrid Cloud scenarios in SQL Server 2014 by reading the SQL Server 2014 Hybrid Cloud White Paper, which is part of the SQL Server 2014 Product Guide.
There’s a lot of excitement around the In-Memory OLTP database engine in SQL Server 2014—and for good reason. Many companies are already realizing impressive performance gains by using it. Here are just a few examples:
Edgenet manages data for Fortune 500 firms, and its success depends on updating clients’ databases so that they accurately reflect transactions, especially for databases supporting online shopping carts. By upgrading to SQL Server 2014, Edgenet boosted throughput sevenfold with In-Memory OLTP. Read the case study here.
SBI Liquidity Market manages more than US$1 trillion each month in foreign currency exchange trading. To increase scalability and speed, it upgraded its trading system to SQL Server 2014. With In-Memory OLTP, SBI Liquidity Market improved throughput by 10x, processing up to 200,000 transactions per second. Read more about this story here.
TPP manages about half of the United Kingdom's 60 million medical records through its hosted clinician service. Fast access to medical records is critical. In the most extreme scenarios, if a doctor can't access the system fast enough, patients' lives could be at stake. When TPP started testing In-Memory OLTP by upgrading its databases to SQL Server 2014, it managed to get the solution up and running in only half a day. Although the databases were running on existing servers, TPP observed a sevenfold performance improvement. Further details on their story are here.
While In-Memory OLTP is transforming performance, it’s just one feature in the broader Microsoft platform that’s supporting modern applications’ demand for intense scale and ease of access.
This end-to-end platform is what sets Microsoft apart, and that's why I'd like to echo some points recently made by Satya Nadella, Microsoft's Executive Vice President of Cloud and Enterprise. In 10 years, I believe Microsoft will still be at the core of computing because the company is continuing to shape the industry with innovations such as the hybrid cloud based on its Cloud OS. Microsoft also offers fast data insights with BI tools that "led the pack" according to Forrester Research. Ongoing adoption of Microsoft's platform by the world's top companies is also a testament to the company's vision.
For more information about In-Memory OLTP including its no locking/no latching concurrency controls, read this article. SQL Server 2014 CTP2 is now available for download here.