Official News from Microsoft’s Information Platform
Machine Learning Blog
As the SQL Server 2014 In-memory OLTP team began working with customers we recognized some common design patterns where the In-memory OLTP engine was a great fit. The intent of this blog is to call out one of those architectures which we call: High data input rate/Shock Absorber and how implementing SQL Server 2014 In-memory OLTP can help improve the performance of applications with this scenario.
What are some characteristics of this architecture?
A common application pattern we have experienced in working with customers is characterized by the need for the database to ingest a very high input rate of data. There are a wide variety of patterns, where the number of sources of data, frequency in which the data is input (burst or steady stream) and requirements around the amount of data input varies. However, the common characteristic is the need to handle either spikes or a constant rate in the incoming workload which may significantly exceed the capabilities of a classic RDBMS. We call this “Shock Absorber” as the scenario includes one or a few tables which are able to “absorb” this input rate (which is the “shock” to the system). Typically this part of the scenario may be a piece of a larger application architecture involving other tables, data and queries. However, in many cases this ingestion rate has become a primary bottleneck within the application.
There are also a number of applications which have exhibited more of an ETL scenario where the data might not just be inserted, but updates, deletes (and data transformation) may occur. In this case a common pattern would be to have a “staging” table to handle the initial load or loading directly into the end-table and scaling the writes and reads concurrently. The bottleneck around ingesting the data remains the same.
There are a wide variety of business solutions which we have found with this type of pattern and some intriguing scenarios where implementing In-memory OLTP can improve the overall application performance significantly.
How can In-memory OLTP help?
These scenarios can benefit from In-memory OLTP in a number of ways. First, in a number of cases the “shock” or overflow of the input workload is bottlenecked on latching or locking contention within the table structure. In some cases, people will architect around last page insert scenarios via reverse indexes or secondary indexing implementations. However they can only scale to a point and can lead to other consequences. With memory-optimized tables all the latching and lock contention is eliminated due to the way data is accessed in-memory and row-versioned. This can provide for much greater scale in particular when running into these contention points.
Another common bottleneck in these scenarios is due to the overhead of the log writes which can extend the time it takes to complete a transaction. This may be a bottleneck around IO capacity of the disk or just the amount of logging as an overhead. The advantages of memory-optimized tables in this case would include:
Furthermore, in cases where the transaction throughput into the database is unable to “keep-up” with the input and additional scale (i.e. adding additional clients or threads) are not able to help or be easily implemented, native compiled stored procedures can help improve the latency characteristics of the T-SQL execution. Other content may address this further, but it is important to call-out that the more performance critical code one can place into the native compiled stored procedure and the In-memory OLTP engine in general, the more chance you have to improve the overall execution performance.
My database doesn’t all “fit” in memory
In many scenarios where memory-optimized tables and native compiled stored procedures can be utilized to target the “hot” OLTP dataset where the bottleneck is exhibited there may be other data which is needed in the database but not necessarily desired to be stored in the In-memory OLTP engine. Some examples would include 1. A need for archive data 2. The query workload pattern is not OLTP and/or different indexing structures are required to satisfy the workload or 3. Physical memory limits would require some tables to be disk-based (standard b-tree) for larger datasets.
As In-memory OLTP is integrated into SQL Server and you can have both memory-optimized tables and disk-based tables in the same database, separating the data storage and retrieval mechanisms between these table structures is also a common pattern. Typically a user created “background” process can execute, for example, on a timed basis, to move data from the memory-optimized table to the disk based table and remove that data from the current memory-optimized table (or potentially just purge).
Ok, but how much does it really help?
Great question! As this is a common scenario we have worked with customers doing testing and here are some early results:
So, do you have scenarios you would like to use a relational database for, but running into issues with regards to the amount of data you can ingest into tables? Consider SQL Server In-memory OLTP engine for meeting your transaction volume and application performance with the database in the critical path.
Interested in more patterns where we have seen success with SQL Server 2014 In-memory OLTP? Stay tuned for a follow-up whitepaper which will include more scenarios and considerations to help you migrate to SQL Server 2014 In-memory OLTP.
Download SQL Server CTP1 and get started today, or see more blogs in the series introduction and index here!