Official News from Microsoft’s Information Platform
We believe that the In-Memory OLTP engine advances the industry state of the art with respect to concurrency control. The main reason for this advancement is due to the combination of lock free algorithms and the row-versioned architecture of the engine.
This post examines what we precisely mean when we describe the In-Memory OLTP engine as being ‘lock free’, both in abstract terms but more importantly in terms of impact on user workloads.
Let’s start with a brief definition capturing the core attributes implied by the term ‘lock free’ in In-Memory OLTP:
“At steady state, transaction carrying threads executing in the context of the In-Memory OLTP engine are designed to require no blocking operation.”
Element by element, this definition implies the following exclusions:
Once we agree on the definitions above, we note that in the context of any database system, locking behavior always implies two distinct considerations: we use locks for logical data protection (also known as transactional isolation) and for physical data protection.
One example of logical data protection is row payload protection. Row payload protection means that while a user modifies a row another user does not modify the same row. This is what logical (transactional) locks are used for in traditional SQL Server (row locks, page locks, database locks and even app locks can be used for the same purpose). With In-Memory OLTP we rely on row versioning to ensure that row content is never modified by two users at the same time – in other words we don’t use transactional locks because we never update data in place. If two or more users try to update the same row at the same time, one will succeed while the others will fail due to a write/write conflict. Note that this can be achieved without any locking by creating a new version for each user and then trying to install each version atomically in the same index location (by using ‘InterlockedCompareExchange’ – or ICX). Out of the multiple users that try to update the same row at the same time, one will succeed and the others will fail this hardware level ICX operation – and that translates directly into the behavior reported back by the system.
Physical data protection is a different problem altogether. In traditional database systems (SQL Server but also more broadly in the industry) we protect internal data structures via spinlocks and latches. Broadly speaking (and oversimplifying), the difference between spinlocks and latches is that a thread trying to acquire a spinlock will spin when the lock is found to be currently held by a different thread whereas for latches the acquiring thread will yield its CPU time back to the OS if the latch is found to be currently held by a different thread. Given this difference, traditional SQL uses latches for waits that could take a while (getting a page in the buffer pool) while spinlocks are used for short term waits (waiting for a memory only linked list traversal for instance). However, both locks and latches have in common the fact that they are ‘region locks’. We use the term ‘region lock’ to describe the mechanism used to protect a region of code or data structures against simultaneous thread access. Region locks implement an ‘Acquire/Release’ pattern – where a lock (either latch or spinlock) is first acquired, the protected region executes, and then the lock is released. The problem with that approach is that it does not scale. In a system with many cores or very high concurrency the region being protected becomes a bottleneck. For instance, in Windows Server the scheduling quantum is around 180ms, so if a thread that holds a spinlock gets preempted, that spinlock will be held for 180ms regardless of how short the protected region would be otherwise. There are other negative side effects from region locks because they all involve writing to shared cache lines even when the lock is acquired for read access – which becomes problematic in many-core and NUMA systems or under high concurrency. The lock free engine avoids these issues by implementing all operations in an atomic fashion. In other words, the In-Memory OLTP engine does not define any protected regions in transaction executing paths. The data structures and algorithms are structured such that state transitions are atomic and therefore are not subject to the whims of the scheduling subsystem. In addition, many operations are done without any shared-cache line modifying instructions at all (meaning that the entire operation does not even use ICX but rather touches in write mode only cache lines that are private to the local processor) which improves scalability and concurrency to the limits supported by the hardware.
The prime example of that is index traversal: the engine walks both hash and range indices without any locks or ICX instructions. In the process the engine detects if the underlying data structure has changed in a manner that could invalidate the current traversal and re-attempts the small portion of the traversal that was invalidated. These re-traversal are extremely rare even at very high concurrency (in the early days of the In-Memory OLTP engine we have measured under 100 retries for million tx / sec workload) – so their measurable performance impact is virtually non-existent. When the engine needs to modify one of these lock free data structures it does so via ICX – which makes the modification visible atomically.
A careful observation of current hardware trends presents overwhelming evidence that the number of available cores in any given computing platform is likely to rise with time. In this context, concurrency control that is at its core lean and efficient is absolutely crucial to achieving first-rate performance. With In-Memory OLTP we have taken these insights to heart and built an engine that relies on no locks, waits, latches or other synchronization primitives to ensure consistency of execution. We believe this approach removes locking and latching as a concern for even our most demanding users.
For more information, download SQL Server CTP1and get started today, or see more blogs in the series introduction and index here.