Official News from Microsoft’s Information Platform
Machine Learning Blog
In this post I want to spend some time discussing “in-memory” database technologies and Microsoft’s technical roadmap in this space. In-memory database technologies are approaching a disruptive tipping point for the database industry and we’re going to hear a lot more about them going forward. These technologies are already reshaping the analytics and reporting segment and they will increasingly impact operational and transaction processing workloads as well.
First, a quick review. Virtually all database management systems (DBMS) try to keep portions of a database resident in RAM. The DBMS manages the transfer of disk based data to and from random access memory (RAM) which is allocated to a region often called a “buffer pool”. For certain scenarios, you can even allocate enough buffer pool space to fit an entire database into RAM. While this can be a viable way to operate a performance sensitive database, it is not what is meant by an “in-memory” DBMS.
Traditional DBMSs were designed to support databases which are typically much larger than available RAM. In-memory DBMSs, on the other hand, are typically designed to support databases which fit in RAM while a number of them support databases which can be larger than available RAM.
In-memory DBMSs have been around for some time. They were originally employed in performance sensitive applications serving telephony and financial services markets. There is a resurgence of interest around in-memory database technology and we are starting to see in-memory DBMS technology reach a disruptive tipping point for a number of scenarios.
Disruptive tipping points occur when a technology comes along and provides a significant improvement in service along several dimensions. Often the disruptive technology has been available for some time and has served more niche scenarios before a set of factors converge to enable a 10x or more improvement for a broader class of scenarios. Consider the recent disruption when non-volatile Flash memory displaced hard disk drives (HDDs) in personal electronic devices such as MP3 players. In a single generation of these devices, the cost, capacity, power and improved physical characteristics of Flash memory displaced hard disks. Flash memory wasn’t new but the price, performance, and durability converged to a point where HDD storage was no longer attractive for this scenario.
In the rest of this post we’ll cover the following:
The last several decades have been breathtaking for information technology advances. The landscape for people designing database systems is constantly changing because some aspects of the technology components continue to advance with exponential change, such as transistor density with Moore’s law, while other aspects are tconstant or evolve with near-linear growth, such as HDD random access time. When these change rates are considered in total, the natural balance between technology elements can change dramatically. In particular, the “speeds and feeds” of hard disks, RAM, and CPUs have evolved dramatically over the last 10-20 years, setting us up for the in-memory inflection.
Most database systems were designed to operate from the ubiquitous HDD which has been subject to physical limits since its inception. In particular, the electromechanical elements to position a read/write unit over the physical media have been a constant challenge. While disk transfer bandwidth and capacity have experienced exponential growth over the last few decades, the time to place the read/write head in the right place has improved far less so. As a result, the sequential bandwidth of reading from disk has experienced exponential growth, while the ability to randomly access small amounts of data on disk has not. This shift led the late Jim Gray, one of the pioneers of the database industry, to proclaim, “Disk is the new tape”.
Dynamic RAM (DRAM), on the other hand, has been riding Moore’s law since its introduction. It has gotten cheaper and its capacity has increased dramatically. Like the HDD, system designers have been able to exponentially increase the bandwidth to and from DRAM but latency, the ability to randomly access a single byte, hasn’t been able to keep pace. This means that the latency to access a random byte of DRAM, measured in CPU processor cycles, has gone from near one CPU cycle 15 to 20 years ago to hundreds of CPU cycles with modern systems.
Moore’s Law still holds in the CPU arena but the designers have run into physics there as well. Up until about 5 years ago software developers got a free ride through faster clock speeds and improved architecture of CPUs. Unfortunately CPU power consumption is not linear with respect to clock speed and increasing clock speeds creates more challenges keeping elements of the processor coherent with respect to the clock. That forces CPU designers to keep things closer together. More power, closer together, created a thermal management problem which essentially put an end to ever increasing clock speeds. Since chip designers still have Moore’s law working for them, they can continue to put more transistors on a CPU die and these transistors need to be used somehow. The result, as we’ve all experienced over the last several years, is more processor cores, more on-board cache, and more specialized functions and instructions. All of these new CPU transistors can be put to good use for in-memory database scenarios but they require a new design approach.
One common theme across all of these changes is that bandwidth and capacity tend to increase much faster than reduction in latency. Latency tends to hit us from both electromechanical fronts, such as hard disk drive seek time, as well as speed of light physics which impacts networks, DRAM latency, etc.
To really understand the dynamics we need to look at these elements from a relative perspective:
In essence – everything is moving farther apart. Today it takes 100s of processor cycles to fetch a random byte from DRAM and many millions of cycles to access a random byte on a hard drive.
The answer, in as much as there is one, to creating high performance systems in the face of these shifts, is twofold: 1) bring things closer to CPU to reduce latency, and 2) stream data as much as possible since bandwidth is continuing to keep up whereas latency is hitting the limits imposed by physics. These forces, set up outside the world of databases, are leading the database implementers to rethink the design approaches they’ve used for decades.
I’ll briefly describe several architectural patterns that designers of in-memory database systems employ. Fully describing the patterns I touch on could fill a book but a brief mention will highlight the differences from more traditional database technologies.
Many in-memory database systems employ data compression of some form. This helps to hold more data in the relatively scarce and relatively expensive DRAM. Decompressing the data, when necessary, to operate on it does take processor cycles, but, considering that getting uncompressed data off of a disk may take many million processor cycles, keeping more compressed data in RAM and then spending a few CPU cycles to decompress the data from there makes a lot of sense these days.
Foundational database algorithms are being revised to support in-memory needs. One approach, known as column based storage, is particularly powerful on modern systems. Most traditional database systems store data in records or rows. In these systems, an employee record may contain a first name, last name, employee ID, etc. and these attributes would all be stored together on-disk and in-memory. Column based approaches store all the values for individual attributes, such as employee ID, last-hire date, last name, together whether on disk or in-memory. Given this structure you might ask how we put “John Doe” back together to display all his attributes if they’re all stored separately. The answer is that each attribute (column) typically has an implicit ordinal number. So, if John Doe is the 56th record in the database we can find the 56th entry in each column storage set and get John Doe’s details.
So how does column storage help us? Let’s assume we wanted to find all employees who were hired in the last year. Finding these employees means searching the “last-hire-date” column. Since we placed all of the list-hire-date values together and likely compressed them as well, the data structure which stores this attribute is very dense – it only contains the data we need to do the search and we don’t have to wade through first and last names, etc. For reasons I won’t go into here, modern CPUs work especially well when scanning through dense data structures. With respect to compression; let’s say that our employees can have one of three states in an employee-status attribute – active, on-leave, or terminated. These three states can be represented with 2-bits each rather than 8 bits if we were to use a byte to represent the values. Let’s also assume that we have 1200 employees and 1104 of them are “active”. If we partition or sort the employees we can represent all 1104 active employees in a very compact representation using simple scheme called “run-length encoding” – basically noted that ‘the next 1104 employees all have the value of “Active”’. There are other more exotic compression schemes used in column based stores but these simple examples serve to make the key points.
Column based storage also lends itself to multi-core processors. Assume we wanted to find all terminated employees whose last hire date was within 18 months. To solve this query we could have one core find all terminated employees and another work on finding those employees with the appropriate hire date.
Locating those records which satisfy both the hire date and employee status criteria is as simple as intersecting the record ordinals which pass both criterion.
Column based storage techniques work particularly well in analytic workloads which are dominated by numeric values and dimensions such as the employee-status attribute we discussed earlier. These workloads have been the domain of multi-dimensional OLAP (MOLAP) products, such as the original form of SQL Server Analysis Services. Traditional MOLAP products required the definition of a logical model so that aggregates, such as sales per quarter or sales per region could be pre-computed and used to rapidly answer questions of appropriate form. The logical model specifies the information model and types of questions you can ask of the data. The logical model also becomes the skeleton by which we can define a physical data model, such as where to pre-compute and pre-aggregate data to support dimensional queries. The raw performance of in-memory database technologies obviates the need for precomputing aggregations and other forms of indexing. With in-memory database technologies, the logical model can evolve based upon the needs of the business domain without having to update a physical model. This combination enables business analysts to create a logical model which meets the business needs without having to become experts in the physical design aspects required of MOLAP products. This results in greater agility and is a foundational part of Microsoft’s self-service managed BI approach.
Microsoft has been investing in, and shipping, in-memory database technologies for some time. We have created a column based storage engine which ships as part of the “PowerPivot” add-in for Microsoft Excel. In SQL Server 2012, this ships as the xVelocity in-memory analytics engine as part of SQL Server Analysis Services. This same engine, integrated into the SQL Server RDBMS server ships as the xVelocity memory optimized columnstore index. This columnstore index will make its way into our parallel data warehousing product as well. One SQL Server 2012 customer was able to achieve a 200x speedup through the use of this new in-memory optimized columnstore index type. You can read about the case study here.
In addition to the obvious performance gains, there are many advantages to using a common engine. In the case of Power Pivot and Analysis Services there is a common Data Analysis Expressions language, known as DAX, which allows analytical queries to run identically in Excel, Excel Services or SQL Server Analysis Services. This is incredibly valuable as it allows business analytics models developed Microsoft Excel to run identically in a server based Analysis Services deployment. In SQL Server 2012 Analysis Services there is a new tool which allows you to very easily upsize an Excel based PowerPivot solution to a server based Analysis Services solution. This allows analytic solutions to be developed in a self-service approach within Excel, and then easily converted to a managed IT solution as needed.
Most of what we’ve covered in this post is about column based in-memory engines designed for analytic and data warehousing workloads – something we’ve been shipping for years. While column based stores are great for these scenarios, they are not optimal for transaction processing workloads which are characterized by inserting, modifying, or reading a single or small number of records at a time. Having to assemble individual records from multiple attributes, as necessary with column based storage models, is typically more expensive than row-based storage models. While most of the current success in in-memory database technologies has been through column based storage for analytic workloads, there are in-memory optimizations which apply to row-based transactional workloads as well. Microsoft is investing in this space as well and has conducted experiments with customers which have achieved greater than 10x speedup on existing transaction processing tasks – more on this in the future.
In-memory database technologies offer incredible advances but ultimately the highest value is achieved when these capabilities are delivered as a part of a complete data platform. There are benefits in terms of the “fundamentals” (management, security, information governance) but also how this enables these capabilities to be more easily used – by both existing and new applications. In terms of what we’ve done so far, by delivering as an index in the relational engine, integrating into Analysis services and delivering an experience for Excel users with PowerPivot, we have an easy on-ramp for existing applications as well as enabling whole new scenarios like self-service business intelligence on the desktop. Our technical approach considers not only best-in-class capability but how to integrate and deliver as a part of a complete data platform – ultimately the highest value for customers will be achieved this way.
A set of forces shaping information technology are setting up conditions for a tipping point where in-memory database technologies will become common over the next 5-10 years. Analytic workloads, dominated by read-mostly queries over numeric facts or numerically encodable dimensions are the first scenario where in-memory technologies are becoming widely adopted. Over time, in memory technologies will also be applied to transaction processing workloads.
You’ll likely hear a lot more about in-memory database technologies going forward. At Microsoft, we believe in this technology inflection point and have been shipping in-memory database technologies for some time in Microsoft Excel Power Pivot and SQL Server Analysis Services. With the introduction of SQL Server 2012, the core relational database engine will also make use of our in memory database technologies as a columnstore based index. Customers with relational data warehouse workloads using this new column based index are experiencing incredible speedups for a broad class of queries making use of this new capability.
Microsoft is also investing in other in-memory database technologies which will ship as the technology and opportunities mature. As a taste of what’s to come, we’re working on an in-memory database solution in our lab and building our real-world scenarios to demonstrate the potential. One such scenario, based upon one of Microsoft’s online services businesses, contains a fact table of 100 billion rows. In this scenario we can perform three calculations per fact – 300 billion calculations in total, with a query response time of 1/3 of a second. There are no user defined aggregations in this implementation; we actually scan over the compressed column store in real time.
In memory database technologies make it an exciting time for both the people building these systems and the customers who use them. On behalf of the SQL Server product engineering team, I hope you have an opportunity to use our in-memory database technologies to solve new challenges in your own organization.
Dave Campbell Technical Fellow Microsoft SQL Server
 David A. Patterson. 2004. Latency lags bandwidth. Communication of ACM Vol 47, 10 (October 2004), pp. 71-75.