Official News from Microsoft’s Information Platform
In-memory OLTP provides full durability for memory-optimized tables. When a transaction that has changed a memory-optimized table commits, SQL Server guarantees, just as it does for disk-based tables, that the changes are permanent and survive a database restart, provided the underlying storage is available.
There are two key components of durability: transaction logging, and persisting data changes to on-disk storage. Let us look at each of these in the context of disk-based and memory-optimized tables.
Transaction Logging: All changes made to disk-based tables or durable memory-optimized tables are captured in one or more transaction log records. When a transaction commits, SQL Server flushes all the log records associated with the transaction to disk before communicating to the application or user session that the transaction has committed. This guarantees that the changes made on behalf of the transaction are durable. Transaction logging for memory-optimized tables differs somewhat from logging for disk-based tables, with the key differences described below.
These differences make logging for memory-optimized tables more efficient while still guaranteeing full durability.
Persisting changes to storage: The transaction log records every change made to a database since it was created, and can be used to reconstruct the database, assuming the log was never truncated. However, this would be very inefficient: SQL Server would need to apply all the transaction log records sequentially from the point the database was created, making the recovery time (RTO) unacceptably high. To avoid this, SQL Server, like other database systems, takes periodic checkpoints that flush changes to durable storage and allow log truncation, reducing the amount of transaction log that must be replayed after a crash. Before describing how data is persisted for memory-optimized tables, let us first look at how data is persisted for disk-based tables and what the performance implications are.
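Before moving on, the recovery-time argument above can be made concrete with a toy model. The sketch below is only an illustration, not how SQL Server implements recovery: the log format, the `replay` function, and the use of a plain dictionary as "the database" are all assumptions made for the example.

```python
def replay(log, snapshot=None, snapshot_lsn=0):
    """Rebuild a key/value state by applying log records newer than the snapshot."""
    state = dict(snapshot or {})
    applied = 0
    for lsn, key, value in log:
        if lsn <= snapshot_lsn:
            continue  # already reflected in the checkpointed state
        if value is None:
            state.pop(key, None)  # a None value models a delete
        else:
            state[key] = value
        applied += 1
    return state, applied


# Hypothetical log: (lsn, key, value) tuples in commit order.
log = [(1, "a", 10), (2, "b", 20), (3, "a", 11), (4, "b", None), (5, "c", 30)]

# Without a checkpoint, every record since "database creation" is replayed.
full_state, full_applied = replay(log)

# With a checkpoint taken at LSN 3, only the tail of the log is replayed.
checkpoint_state, _ = replay(log[:3])
fast_state, fast_applied = replay(log, snapshot=checkpoint_state, snapshot_lsn=3)
```

Both paths arrive at the same state, but the second applies only the records written after the checkpoint, which is why records before that point can be truncated.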
Performance Bottleneck: Access to data and index pages generates random IO, which reduces the IOPS available from rotating media. The checkpoint operation can also cause bursts of heavy IO activity that negatively affect the production workload. The indirect checkpoint feature, available as part of SQL Server 2012, reduces the amount of IO done as part of a checkpoint, but the fact remains that the IO is random.
The persistence of memory-optimized tables is done with a set of data and delta files, populated by a background thread (described later). These files are located in one or more containers, leveraging the same mechanism as used for FILESTREAM data. These containers are mapped to a new type of filegroup, a memory-optimized data filegroup (declared with CONTAINS MEMORY_OPTIMIZED_DATA). For example, the following commands create the database ContosoOLTP and add a memory-optimized data filegroup with one container to it.
CREATE DATABASE ContosoOLTP
ON PRIMARY (NAME = [contoso_data], FILENAME = 'C:\data\contoso_data.mdf', SIZE = 100MB)
LOG ON (NAME = [contoso_log], FILENAME = 'C:\data\contoso_log.ldf', SIZE = 100MB)

-- Enable the database for memory-optimized tables
-- add memory_optimized_data filegroup
ALTER DATABASE ContosoOLTP ADD FILEGROUP contoso_mod CONTAINS MEMORY_OPTIMIZED_DATA

-- add container to the filegroup
ALTER DATABASE ContosoOLTP ADD FILE (NAME = 'contoso_mod', FILENAME = 'c:\data\contoso_mod') TO FILEGROUP contoso_mod
As indicated earlier, there are two kinds of files:
A data file contains rows from one or more memory-optimized tables, inserted by multiple transactions as part of INSERT or UPDATE operations. For example, one row can be from memory-optimized table T1 and the next row can be from table T2. Each data file is approximately 128MB in size. Rows are only ever appended to the data file, which exploits sequential IO. Once the data file is full, rows inserted by new transactions are stored in another data file. Over time, the rows from durable memory-optimized tables are stored across one or more data files, with each data file containing rows from a disjoint but contiguous range of transactions. For example, a data file with a transaction commit timestamp range of (100, 200) has all the rows inserted by transactions whose commit timestamp falls in this range. The commit timestamp is a monotonically increasing number assigned to a transaction when it is ready to commit. Each transaction has a unique commit timestamp.
When a row is deleted or updated by a later transaction, the row is not removed or changed in place in the data file. Instead, the deleted row is tracked in a second type of file, the delta file. Update operations are processed as a delete followed by an insert of the row. This eliminates random IO on the data file.
Each data file is paired with a delta file that has the same transaction range and tracks the deleted rows that were inserted by transactions in that range. For example, a delta file corresponding to transaction range (100, 200) will store deleted rows that were inserted by transactions in the range (100, 200). Like data files, the delta file is accessed sequentially.
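The relationship between a data file and its delta file can be sketched as a small in-memory model. This is an illustration only, under stated assumptions: the class `CheckpointFilePair`, its fields, the tuple layout of a row, and the choice to treat the range bounds as inclusive are all hypothetical and do not reflect the on-disk format.

```python
from dataclasses import dataclass, field


@dataclass
class CheckpointFilePair:
    """Toy model of one data file and its paired delta file."""
    low_ts: int   # lowest commit timestamp covered (assumed inclusive)
    high_ts: int  # highest commit timestamp covered (assumed inclusive)
    data_rows: list = field(default_factory=list)   # append-only inserted row versions
    delta_rows: list = field(default_factory=list)  # append-only references to deleted row versions

    def covers(self, commit_ts):
        return self.low_ts <= commit_ts <= self.high_ts

    def append_insert(self, commit_ts, table, key, payload):
        """Append a row version; rows from different tables share one data file."""
        if not self.covers(commit_ts):
            raise ValueError("commit timestamp outside this file's range")
        self.data_rows.append((commit_ts, table, key, payload))

    def append_delete(self, insert_ts, table, key):
        """Record that the row version inserted at insert_ts has been deleted."""
        if not self.covers(insert_ts):
            raise ValueError("row version was not inserted in this file's range")
        self.delta_rows.append((insert_ts, table, key))

    def live_rows(self):
        """Return data-file rows that the delta file has not marked as deleted."""
        deleted = set(self.delta_rows)
        return [row for row in self.data_rows if row[:3] not in deleted]


pair = CheckpointFilePair(low_ts=100, high_ts=200)
pair.append_insert(120, "T1", 1, "v1")
pair.append_insert(130, "T2", 7, "x")

# An update of T1 key 1 committing at timestamp 150 is a delete plus an insert.
pair.append_delete(120, "T1", 1)
pair.append_insert(150, "T1", 1, "v2")

live = pair.live_rows()
```

Note that the old version of the row stays in `data_rows`; nothing is rewritten in place, and both lists only ever grow at the end, which mirrors the sequential access pattern described above.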
Populating the Data and Delta files
User transactions do not append to the data or delta files themselves. This is done by a background thread, called the offline checkpoint worker. This thread reads the transaction log records generated by committed transactions on memory-optimized tables and appends inserted and deleted rows to the appropriate data and delta files. Unlike disk-based tables, where data/index pages are flushed with random IO when a checkpoint is taken, the persistence of memory-optimized tables is a continuous background operation.
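As a rough illustration of what the offline checkpoint worker does, the sketch below drains a list of committed log records into data and delta files chosen by timestamp range. The record layout, the `find_pair` and `drain_log` helpers, and the dictionary representation of a file pair are assumptions for the example, not the engine's actual structures.

```python
def find_pair(pairs, ts):
    """Locate the data/delta file pair whose timestamp range contains ts."""
    for pair in pairs:
        if pair["low"] <= ts <= pair["high"]:
            return pair
    raise LookupError(f"no file pair covers timestamp {ts}")


def drain_log(log_records, pairs):
    """Append committed changes to data and delta files, in log order."""
    for rec in log_records:
        if rec["op"] == "insert":
            # The row goes to the data file covering the inserting transaction.
            row = (rec["commit_ts"], rec["table"], rec["key"], rec["payload"])
            find_pair(pairs, rec["commit_ts"])["data"].append(row)
        elif rec["op"] == "delete":
            # The delete goes to the delta file paired with the data file
            # that holds the row version being deleted.
            ref = (rec["row_ts"], rec["table"], rec["key"])
            find_pair(pairs, rec["row_ts"])["delta"].append(ref)


# Two hypothetical file pairs with disjoint, contiguous ranges.
pairs = [
    {"low": 100, "high": 200, "data": [], "delta": []},
    {"low": 201, "high": 300, "data": [], "delta": []},
]

log_records = [
    {"op": "insert", "commit_ts": 150, "table": "T1", "key": 1, "payload": "v1"},
    {"op": "insert", "commit_ts": 180, "table": "T2", "key": 9, "payload": "x"},
    # An update committing at timestamp 250: delete the old version, insert the new one.
    {"op": "delete", "commit_ts": 250, "row_ts": 150, "table": "T1", "key": 1},
    {"op": "insert", "commit_ts": 250, "table": "T1", "key": 1, "payload": "v2"},
]

drain_log(log_records, pairs)
```

In this model the update at timestamp 250 touches two pairs: the delete marker lands in the delta file of the (100, 200) pair, where the old version lives, while the new version is appended to the data file of the (201, 300) pair.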
Checkpoint for Memory-Optimized Tables
The checkpoint for memory-optimized tables is done when the transaction log has grown by more than an internal threshold (currently set to 1GB) since the last checkpoint. The checkpoint operation is completed with the following steps:
During crash recovery, SQL Server locates the last completed checkpoint from the transaction log, loads the data and delta files and then applies the active part of the transaction log to bring memory-optimized tables to the current point-in-time.
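The recovery sequence can be sketched in the same toy style: load the rows that the delta files have not marked as deleted, then apply the active part of the log. The `recover` function, the record layout, and the simplification of keeping a single version per key are assumptions made for illustration; the real engine is multi-versioned and far more involved.

```python
def recover(pairs, log_tail):
    """Rebuild in-memory tables from checkpoint file pairs, then replay the log tail."""
    tables = {}
    # Step 1: load data files, filtering out rows listed in the paired delta file.
    for pair in pairs:
        deleted = set(pair["delta"])
        for commit_ts, table, key, payload in pair["data"]:
            if (commit_ts, table, key) not in deleted:
                tables.setdefault(table, {})[key] = payload
    # Step 2: apply log records written after the last completed checkpoint.
    for rec in log_tail:
        rows = tables.setdefault(rec["table"], {})
        if rec["op"] == "insert":
            rows[rec["key"]] = rec["payload"]
        elif rec["op"] == "delete":
            rows.pop(rec["key"], None)
    return tables


# Hypothetical checkpoint files: rows are (commit_ts, table, key, payload),
# delta entries are (insert_ts, table, key) of deleted row versions.
pairs = [
    {"data": [(150, "T1", 1, "v1"), (180, "T2", 9, "x")], "delta": [(150, "T1", 1)]},
    {"data": [(250, "T1", 1, "v2")], "delta": []},
]

# Hypothetical active log: changes committed after the last completed checkpoint.
log_tail = [
    {"op": "insert", "table": "T1", "key": 2, "payload": "new"},
    {"op": "delete", "table": "T2", "key": 9},
]

tables = recover(pairs, log_tail)
```

The amount of log replayed in step 2 is bounded by what was written since the last completed checkpoint, which is what keeps recovery time manageable.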
For more information, download SQL Server CTP1 and get started today, or see more blogs in the series introduction and index here.